ABSTRACT


Scheduling analysis is one of the most important activities in hard real-time
systems development since the correctness of hard real-time systems depends not only on
the logical results of computation, but also on the time at which the results are produced.
This dissertation aimed at the development of both fundamental theory and software tools
to support efficiently and reliably the scheduling of distributed hard real-time systems. The
major work of this dissertation focuses on non-preemptive hard real-time scheduling, for
periodic and sporadic task sets, although some of the results are also applicable to the
preemptive case.

Several theorems for checking the schedulability of non-preemptive task sets are
developed. Previous results on necessary and sufficient conditions for scheduling
non-preemptive task sets are extended to cover the case when the task deadlines can be smaller
than or equal to their periods. The concept of transient and cyclic schedules is introduced to
overcome the weakness of the traditional methods, which restrict the construction of a
cyclic schedule to a fixed interval of length equal to the least common multiple of the
periods. An algorithm for reducing the schedule length of periodic task sets is developed
to further enhance the schedulability of hard real-time systems. A preliminary study on
randomly generated graphs shows that the algorithm does produce near-optimal solutions.

To ease the problem of synchronization among tasks in distributed hard real-time
systems, we introduce the Fundamental Synchronization Theorem and a novel model for
designing distributed hard real-time systems without explicit synchronization, and develop
an Ada95 software architecture to support such a model. The application of this theorem
will allow us to treat each set of tasks allocated to a particular processor as a totally
independent set, if the tasks satisfy the conditions described in the theorem. This approach
will greatly decrease the difficulties in scheduling large distributed real-time systems.

One of the necessary steps in distributed hard real-time scheduling is the allocation 
of tasks to different processors in the distributed system. Algorithms for task allocation 
which minimize the inter-module communication costs are developed and implemented. 

Finally, a timing model for handling different time references in rapid prototyping
systems is introduced, to support the reuse of real-time components. 





TABLE OF CONTENTS 

I. INTRODUCTION TO HARD REAL-TIME SYSTEMS

A. INTRODUCTION

B. REVIEW OF PREVIOUS WORK

1. Preemptive Static Scheduling

2. Non-Preemptive Static Scheduling

3. Summary of Scheduling Complexity

4. A Brief Note about the Periodic Task Complexity

5. Complexity Results for Message Routing in Distributed Systems

II. CAPS AND PSDL OVERVIEW

A. MOTIVATION

B. THE WATERFALL MODEL

C. THE SPIRAL MODEL

D. THE COMPUTER AIDED PROTOTYPING SYSTEM (CAPS)

1. CAPS Tools

a. The PSDL Editor

b. The Text Editor

c. The Interface Editor

d. The Requirements Editor

e. The Change Request Editor

f. The Translator

g. The Scheduler

h. The Compiler

i. The Evolution Control System

j. The Merger

k. The Software Base

E. THE PROTOTYPING SYSTEM DESIGN LANGUAGE (PSDL)

1. PSDL Computational Model

a. Operators

b. Data Streams

c. State Streams

e. Exceptions

f. Timers

2. Control Abstractions

a. Periodic and Sporadic Operators

b. Data Triggers

c. Execution Guards

d. Conditional Output

3. Timing Constraints

4. A PSDL Prototype Example

III. FUNDAMENTAL ISSUES IN REAL-TIME SCHEDULING

A. THE SCHEDULING MODEL AND SOME DEFINITIONS

B. CONDITIONS FOR SCHEDULABILITY OF NON-PREEMPTIVE TASKS

1. The Maximum Execution Time Theorem

2. The Finish-Within Theorem

3. The Minimum Period Theorems

4. The Load Factor Theorem

5. The Task Demand Theorem

C. THE HARMONIC BLOCK DILEMMA

D. A NOTE ABOUT PRECEDENCE CONSTRAINTS

E. COPING WITH APERIODIC TASKS

1. The Conversion

2. Important Remarks about the Conversion

3. Implementation Issues about the Conversion

IV. DISTRIBUTED SCHEDULING

A. INTRODUCTION

B. ARCHITECTURAL ISSUES

1. Different Clocks

2. Speed of CPUs

3. Memory

4. The Communication Media

5. Interconnectivity

C. THE PROBLEM STATEMENT

D. SYNCHRONIZATION IN PSDL

E. DEALING WITH SPECIAL CASES

F. TACKLING THE SYNCHRONIZATION PROBLEM

1. Additional Restrictions Imposed on the Timing Constraints

G. THE TASK ALLOCATION MODEL

1. Some Basic Definitions

2. The Approach

3. The Current Implementation

V. ARCHITECTURAL ISSUES OF THE CAPS SCHEDULER

A. THE CURRENT SCHEDULER - UNIPROCESSOR ARCHITECTURE

1. Data Triggers

2. Execution Triggers

3. Output Guards

B. THE PROPOSED DISTRIBUTED ARCHITECTURE

C. IMPLEMENTATION ISSUES OF THE COMMUNICATION SUBSYSTEM

1. The RPC Model

2. The First Approach

3. The Ada95 Approach

a. The Package Streams

b. Conclusions

D. CPU SPEED RATIO ISSUES IN A PROTOTYPING ENVIRONMENT

1. Choosing a Reference

2. CAPS Timing Model

a. Building the Prototype

b. Installing Components in the Software Base

3. Relations between CPU Speed Ratio and Timing Errors

4. How the CPU Speed Ratio affects Scheduling

5. Handling Unwanted Interactions during Prototype Scheduling

VI. EXPERIMENTAL RESULTS

A. INTRODUCTION

B. THE RANDOM GRAPH GENERATOR

C. FIRST FINDINGS AFTER USING THE RANDOM GRAPH GENERATOR

D. MINIMIZING THE HARMONIC BLOCK

E. THE NEW DISTRIBUTED SCHEDULING ALGORITHM - SOME RESULTS

VII. CONCLUSIONS AND RECOMMENDATIONS

A. SUMMARY OF THE DISSERTATION

B. POSSIBLE CAPS MODIFICATIONS

1. Enhancing the CAPS Syntax Directed Editor (SDE)

2. Tasks with Soft Deadlines

3. Preemptive Static Scheduling

4. Triggering Conditions versus Stream Types

5. Estimating the Execution Time

6. The Uninitialized Sampled Stream Problem

7. State Stream versus Data Flow

C. CONCLUSIONS

LIST OF REFERENCES

BIBLIOGRAPHY

INITIAL DISTRIBUTION LIST

































TABLE OF FIGURES AND TABLES 


Figure 1.1. Types of Task Deadlines

Figure 1.2. Scheduling Taxonomy

Table 1.1. Major Results in Scheduling Algorithms

Table 1.2. Summary of Non-Preemptive Scheduling Complexity

Table 1.3. Complexity of the Scheduling Problem with Several Resources

Table 1.4. Complexity for Non-Preemptive Transmissions

Figure 2.1. The Waterfall Model

Figure 2.2. The Prototyping Process

Figure 2.3. The Spiral Model

Figure 2.4. The CAPS Structure

Figure 2.5. Sporadic Timing Constraints

Figure 2.6. Periodic Timing Constraints

Figure 2.7. The Scheduling Interval

Table 2.1. Main PSDL Timing Constraints

Figure 2.8. Prototype of an Autopilot

Table 3.1. Summary of our Scheduling Model

Figure 3.1. Theorem 1 for the Sporadic Case

Figure 3.2. Pipelining Operators

Figure 3.3. The Minimum Period Sliding Window

Figure 3.4. Different Task Release Time for Task X

Figure 3.5. The Transient and Cyclic Schedules

Figure 3.6. Determining the Start Time tc of the Cyclic Schedule

Figure 3.7. The Sporadic Conversion when MCP < MRT-MET

Figure 3.8. The Sporadic Conversion when MCP ≥ MRT-MET

Figure 3.9. Worst Case Situation

Figure 3.10. Effects of TP on the Load Factor

Figure 3.11. Restrictions in the Producer Imposed by the Consumer’s MCP

Figure 4.1. Typical Radar Data

Figure 4.2. Producers with Different Periods

Figure 4.3. Potential Overflow Situation

Figure 4.4. Different Stream Types Combination

Figure 4.5. Period Incompatibility among Operators

Table 4.1. PSDL Data Triggering Semantic Table

Table 4.2. PSDL Timing Constraints Semantic Table

Figure 4.6. Reason for No Synch when PERprod ≥ PERcons (Uniprocessor Case)

Figure 4.7. Reason for No Synch when PERprod < PERcons (Distributed Case)

Figure 4.8. Reason for No Synch when PERprod ≥ PERcons (Distributed Case)

Figure 4.9. Synchronization among Periodic Operators when PWa = METa

Figure 4.10. The Consumer-Producer Paradigm

Figure 4.11. Seeking for an Upper-Bound

Figure 4.12. New Timing Constraints for the Sporadic Operator

Figure 4.13. The Saturation Effect

Table 4.3. Placement Cost Matrix

Table 4.4. IMC Cost Matrix

Table 4.5. Distance Cost Matrix

Figure 4.14. The Data Dependency Graph

Figure 4.15. Algorithm for Calculating the IMC Cost Function

Figure 4.16. Partial View of the Allocation Program

Figure 5.1. Partial View of Patriot a

Figure 5.2. TRIGGERED BY SOME Implementation

Figure 5.3. TRIGGERED BY ALL Implementation

Figure 5.4. TRIGGERING IF Implementation

Figure 5.5. Output Guards Implementation

Figure 5.6. CAPS Supervisory Program Structure

Figure 5.7. The New PSDL_Streams Ada Package Specification

Figure 5.8. Body of the Network Stream Task

Figure 5.9. Justification for the Header Information

Figure 5.10. The RPC Programs for the New Scheduler

Figure 5.11. Package System.RPC (Specification)

Figure 5.12. Package Ada.Streams (Specification)

Figure 5.13. Stream Attributes

Figure 5.14. Architecture for the Distributed CAPS Scheduler

Table 5.1. Default Values for the Timing Model

Figure 5.15. Effect of the CPU Speed Ratio on the Schedule

Figure 6.1. Partial View of the Data Structure Used to Build the Random Graph

Figure 6.2. Algorithm for Optimizing the LCM

Figure 6.3. Optimization Results

Table 7.1. Triggering Condition and Stream Type Combinations

Figure 7.1. The Uninitialized Sampled Stream Problem
































ACKNOWLEDGMENTS 


First and foremost, I am grateful to my friend, lover and wife Cristina, and our
children Igor and Lucas, for enduring throughout the course of this longer-than-planned
journey. Their love, support and encouragement helped make this dissertation possible.

Next, I would like to thank my parents Franklin and Helena for their unconditional
love and support throughout my life.

To my dissertation advisor, Man-Tak Shing, I would like to express my deepest
gratitude for all his confidence, guidance and support. I will never forget the night before
my defense, after that strong rain that isolated his house, when he kept trying by all means,
and finally succeeded, to meet me to rehearse my presentation.

I would also like to thank the other members of my committee, Luqi, Amr Zaky,
Sherif Michael and Jim Sanders, for helping me in various ways during the course of this
research. To Yutaka Kanayama, who was the Ph.D. Committee Chairman during most of
my tour as the Ph.D. Student Representative, my special thanks for his patience and
assistance. Many thanks to my fellow Ph.D. students, whose friendship and support were
very important to my success. Thanks also to the staff of the Computer Science
Department, especially Russell Whallen, Mike A^^ams and Walter Landaker for their
unrestricted and unremitting support.

Finally, I would like to thank God for helping me overcome one more stage in my
life journey.





I. INTRODUCTION TO HARD REAL-TIME SYSTEMS


A. INTRODUCTION 

Traditionally, most real-time systems have been built for military purposes. As 
computers become faster, more inexpensive, and more reliable, a tendency towards 
automation is emerging in virtually every field of activity. Areas in which real-time
systems are being more widely employed include manufacturing, communications,
defense, transportation, aerospace, energy, and health care. 

“Hard real-time systems” are defined as those systems in which the correctness of 
the system depends not only on the logical results of computation, but also on the time at 
which the results are produced. They are also characterized by the fact that severe 
consequences will result if logical as well as timing correctness properties of the system
are not satisfied. [SR88] 

To put it briefly, real-time systems differ from traditional systems in that deadlines
or other explicit timing constraints are attached to the tasks or processes. 

Audsley and Burns presented a very interesting approach [AB93], where the time
taken to complete a task is mapped against the value this task has to the system,
developing the so-called time-value functions. This work proposes an adaptation of their
approach to be used by CAPS, where the time-critical tasks could have several kinds of
deadlines, as shown in Figure 1.1. Tasks with hard deadlines may cause damage to the
system if they start early or finish late. Tasks with soft deadlines convey the main idea of
“better late than never”, and the tasks with hybrid deadlines can be assumed to have a soft
deadline behavior until a certain point in time, but then they become hard deadline tasks,
generating damage to the system. Using this approach, it is possible to determine whether
it is more convenient to preempt a task that has not finished within its deadline or keep it
running. This approach provides a much better representation for a task deadline than
that achieved by merely calling it a soft or a hard deadline.

In general it can be said that there are three types of tasks, depending upon their
deadline characteristics. Periodic tasks execute on a regular basis, and usually
have a period and a required execution time. Aperiodic tasks (also known as
non-periodic tasks) are essentially random tasks triggered by some external event. Aperiodic
tasks may also have some timing constraints that limit their maximum start or finish time.
However, if aperiodic tasks are allowed to have hard deadlines (in other words, if they are
allowed to have negative values once the deadline is missed), worst-case analysis cannot be
carried out without further restricting their timing behavior. This is the rationale
behind the third type of task, the sporadic task, in which a minimum period between any
two aperiodic events is required. [AB93]



Figure 1.1. Types of Task Deadlines 
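
The three deadline types named above can be read as time-value functions. The following is a minimal Python sketch of that idea, not the functions drawn in Figure 1.1: the linear decay, the `grace` interval, the value scale and the `penalty` constant are all assumptions made for illustration, and the damage caused by a hard-deadline task that starts too early is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deadline:
    kind: str               # "hard", "soft", or "hybrid"
    due: float              # nominal deadline
    value: float = 1.0      # value of finishing on time (assumed scale)
    penalty: float = -1.0   # assumed negative value once damage occurs
    grace: float = 0.0      # assumed length of the soft tail after `due`


def task_value(d: Deadline, finish: float) -> float:
    """Value to the system of a task that completes at time `finish`."""
    if finish <= d.due:
        return d.value
    if d.kind == "hard":
        return d.penalty                      # late completion causes damage
    late = finish - d.due
    if late < d.grace:
        return d.value * (1.0 - late / d.grace)   # "better late than never"
    # After the soft tail: a soft task is merely worthless,
    # a hybrid task starts behaving like a hard one.
    return d.penalty if d.kind == "hybrid" else 0.0


if __name__ == "__main__":
    hybrid = Deadline("hybrid", due=10, grace=4)
    for finish in (9, 12, 14):
        print(finish, task_value(hybrid, finish))
```

Under these assumptions, a scheduler could compare the value of letting a late task finish against the value of preempting it, which is the decision the text describes.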

In addition to timing constraints, a task can have other constraints, such as [SR88]: 

1) resource constraints - which note the resources required during the execution
of the task 

2) precedence constraints - that specify a partial (perhaps total) ordering on the 
execution of the tasks 

3) concurrency constraints - that describe which tasks can run concurrently, to 
share, for example, a resource 

4) placement constraints - which note whether a given task is to run on a specific
processor

5) criticalness - which is the relative value to the system that is associated with 
some specific task when it meets its deadline 

6) preemptiveness - determining whether a task can be interrupted by other tasks 
and resume execution afterwards 

7) communication requirements - that note issues, such as acceptable delays, for 
inter-task communications and synchronization protocols

Task scheduling in hard real-time systems can be either static or dynamic. In static 
scheduling it is assumed that all information about the tasks is known a priori, and the 
schedule is usually generated off-line. In dynamic scheduling, although all information 
about the tasks may be known a priori, they are allowed to be dynamically invoked, and 
the schedule is calculated “on the fly”. There has been a great deal of debate about the 
appropriateness of dynamic algorithms for hard real-time systems. Many people are in 
favor of static scheduling because it seems reasonable to assume that for safety-critical
applications schedulability should be guaranteed before execution [AB93].

B. REVIEW OF PREVIOUS WORK 

According to Baker [Bak74], scheduling is the allocation of resources over time to 
perform a collection of tasks. This rather general definition conveys the basic idea of 
scheduling theory, which is a collection of principles, models, techniques and logical 
conclusions that provide insight into the scheduling function.

Many of the early developments in the field of scheduling were motivated by 
problems arising in manufacturing. Today, even though scheduling is used in many 
different areas, there are still references that deal with machines instead of processors, and
with jobs instead of tasks. 

In order to have a better understanding of the context in which scheduling issues 
are found, it is reasonable to begin by proposing a taxonomy for the scheduling function. 
This taxonomy is an enhancement of that proposed by Cheng, et al. [CSR87] and is
illustrated in Figure 1.2. 

As shown in the figure, classical scheduling can be divided into four major areas: 
single-machine problems, parallel-machine, flow shop, and job shop scheduling. Most of 
these areas make use of objective functions, such as minimizing flowtime, minimizing
mean tardiness, and minimizing completion time (makespan), which do not convey much
of the important information needed by real-time systems. In most of these problem areas,
the deadline concept is not even considered. Nevertheless, some of these results can
provide very fruitful insights into real-time scheduling problems. Another issue that is not 
considered in many of the problems associated with classical scheduling is the idea of 
periodic tasks, meaning tasks that run forever. For further reading on classical scheduling 
the reader is directed to the work of Baker [Bak74] and Stankovic, et al. [SSN93]. The 
latter reference presents a concise survey on the implications of classical scheduling results 
for real-time systems. 



Figure 1.2. Scheduling Taxonomy 


Tasks can also be distinguished as preemptable or non-preemptable. A task is
preemptable if it can be interrupted by other tasks and can resume execution afterwards.
A non-preemptable task, once started, must run to completion.

Another concept that requires introduction is the difference between
multiprocessor systems and distributed systems. In multiprocessor systems, the cost of
interprocessor communications is negligible, as the different processors usually have some
kind of shared memory and a global clock. In distributed systems, the cost of 
interprocessor communications is not negligible, as the processors do not share any 
memory space and each processor has its own clock. It is now appropriate to make a brief 
review of some previous work done in hard real-time scheduling, with an emphasis on the 
results related to static scheduling. 

1. Preemptive Static Scheduling 

In cases where the tasks are periodic, which is the most common case in
real-time systems, it can be said that the most important result for the uniprocessor case
was provided by Liu and Layland [LL73]. They proved that the Earliest Deadline First
(EDF) algorithm is optimal for any set of independent periodic tasks, where optimality is
defined by the statement, “if a set of tasks can be scheduled by any algorithm, then it can
be scheduled by the EDF algorithm”. They also demonstrated some bounds on processor
utilization when using this algorithm. Their results were extended to cover cases where
the release times are arbitrary by Jeffay [Jef89a]. Also based on Liu and Layland's work,
a more elaborate schedulability test was proposed by Lehoczky, et al. [LSD89]. This test
employed the concept of processor time demand for handling cases where the deadlines
were smaller than the periods. Sha and Lehoczky [LS86] described a technique of
splitting the periods so that better processor utilization could be achieved.
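
As an illustration of the EDF result and its utilization bound, the sketch below compares the utilization test (total utilization at most 1, for independent preemptive periodic tasks whose deadlines equal their periods) with a unit-step simulation of preemptive EDF over one least common multiple of the periods. The task representation as `(computation_time, period)` integer pairs, the synchronous release at time 0, and all function names are assumptions chosen for the example.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def utilization(tasks):
    """Total processor utilization of (computation_time, period) pairs."""
    return sum(Fraction(c, p) for c, p in tasks)


def edf_meets_deadlines(tasks):
    """Simulate preemptive EDF in unit steps over one hyperperiod.

    Assumes integer parameters, deadlines equal to periods, and all
    tasks released together at time 0.
    """
    horizon = reduce(lambda a, b: a * b // gcd(a, b), (p for _, p in tasks), 1)
    jobs = []  # each pending job is [absolute_deadline, remaining_time]
    for t in range(horizon):
        for c, p in tasks:
            if t % p == 0:
                jobs.append([t + p, c])       # new job released at time t
        if any(deadline <= t for deadline, _ in jobs):
            return False                      # a pending job missed its deadline
        if jobs:
            job = min(jobs)                   # earliest absolute deadline first
            job[1] -= 1
            if job[1] == 0:
                jobs.remove(job)
    return not jobs                           # everything due by the horizon is done


if __name__ == "__main__":
    for tasks in ([(1, 4), (2, 6), (3, 12)], [(2, 3), (2, 4)]):
        print(tasks, utilization(tasks), edf_meets_deadlines(tasks))
```

The simulation is exponential in the encoding of the periods in the worst case, which is why the closed-form utilization test is the one used in practice for this task model.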

Horn [Hor74] developed an optimal O(n^2) algorithm that was also based
on the earliest deadline first principle. Originally formulated for non-periodic tasks, this
algorithm proved capable of handling independent tasks with arbitrary deadlines and
release times in a uniprocessor environment. For the same type of tasks, he also
introduced an algorithm for the multiprocessor case that was based on the network flow
method. Martel [Mar82] extended the work of Horn by allowing for processors with
different speeds.

For multiprocessor scheduling of periodic tasks, most researchers have
adopted a partition approach, where some kind of bin-packing algorithm is used to
determine the sub-optimal partitions. Examples can be found in the work of Davari and
Dhall [DD86], Bannister and Trivedi [BT83], and in that of Dhall and Liu [DL78].
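
The partition approach can be sketched with a first-fit decreasing heuristic. This is a generic bin-packing illustration, not the algorithm of any of the cited papers; it assumes that each processor runs preemptive EDF with deadlines equal to periods, so that a processor is "full" at utilization 1, and the function name and task representation are hypothetical.

```python
from fractions import Fraction


def first_fit_partition(tasks, processors):
    """Assign (computation_time, period) tasks to processors by first fit.

    Tasks are considered in order of decreasing utilization. Returns one
    task list per processor, or None if the heuristic fails to place a task
    (which does not prove that no valid partition exists).
    """
    order = sorted(tasks, key=lambda t: Fraction(t[0], t[1]), reverse=True)
    bins = [[] for _ in range(processors)]
    load = [Fraction(0)] * processors
    for c, p in order:
        u = Fraction(c, p)
        for i in range(processors):
            if load[i] + u <= 1:          # assumed per-processor capacity
                bins[i].append((c, p))
                load[i] += u
                break
        else:
            return None                   # no processor could take this task
    return bins


if __name__ == "__main__":
    print(first_fit_partition([(3, 4), (1, 2), (1, 4), (1, 2)], 2))
```

Because the heuristic may reject task sets that an exhaustive search would place, the resulting partitions are sub-optimal in the sense used above.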

2. Non-Preemptive Static Scheduling 

There has been a great deal of research in the area of preemptive real-time 
scheduling. For the non-preemptive case, however, most problems have been shown to be 
NP-hard, even in the uniprocessor case. Hence, the majority of the work that has been 
done in this area covers very specific cases, such as when unit computation times are 
involved, or when release times are the same. Moore [Moo68] showed that the earliest
deadline algorithm is optimal for scheduling a set of independent tasks that have the same
release time. Bratley, Florian and Robillard [BFR71] developed an implicit enumeration
algorithm to determine scheduling for non-preemptive tasks with arbitrary release times
and deadlines. Baker and Su [BS74] used a similar approach to minimize the maximum
tardiness of tasks. Erschler, et al. [EFM83] developed a necessary condition for
scheduling tasks with arbitrary release times and deadlines. When utilizing periodic task
sets, which are definitely the major area of focus for this study, the major results can be
found in the work of Mok [Mok83], Xu [XP90], Jeffay [JSM91] and Zhu [ZLC94].
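
For the special case mentioned above, in which all tasks share the same release time, the earliest-deadline ordering gives a simple non-preemptive feasibility check. The sketch below assumes a common release time of 0 and represents each task as a hypothetical `(computation_time, deadline)` pair; it illustrates the ordering rule only and is not taken from the cited papers.

```python
def edd_feasible(tasks):
    """Check non-preemptive feasibility of tasks released together at time 0.

    Tasks are run back to back in order of increasing deadline; if this
    order misses a deadline, no other order can meet all of them.
    """
    finish = 0
    for c, d in sorted(tasks, key=lambda task: task[1]):
        finish += c                 # task runs to completion without preemption
        if finish > d:
            return False
    return True


if __name__ == "__main__":
    print(edd_feasible([(2, 9), (3, 3), (1, 5)]))
    print(edd_feasible([(2, 2), (2, 3)]))
```

Once release times differ, this simple ordering argument no longer holds, which is consistent with the NP-hardness results cited in this section.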

3. Summary of Scheduling Complexity 

In dealing with scheduling problems where most of the input instances have been 
proven to be NP-hard, it is very important and beneficial to know in which class a 
particular instance belongs, so that the problem can be addressed appropriately. However, 
when one looks into the huge amount of research in this area, it becomes apparent that the 
various studies are very difficult to compare. While it is undesirable to limit the creativity
of researchers, it is increasingly apparent that some kind of standard is needed, so that 
individual research efforts at least speak in the same language. 

Nevertheless, this section offers a summary of the major results achieved in the
area of time complexity of scheduling algorithms, for both the preemptive and
non-preemptive cases. Whenever the result is applicable to periodic task sets, it will be briefly
mentioned.

Table 1.1 lists, for each case, the number of processors (m), the
precedence relation (<) among the tasks (if one exists), the valid domain for the release
time (r_i), the deadline (d_i), the computation time (c_i), whether it is preemptive or
non-preemptive, the time complexity of the problem, the reference paper, and, finally, some
additional remarks. Note that in this table most of the results are for non-periodic task 
sets. In the following section, the problem of how to apply these results to the periodic 
case is addressed. 

Table 1.1. Major Results in Scheduling Algorithms











































































































Table 1.2 summarizes the complexity boundaries of various non-preemptive
problems with respect to the number of processors, computation time, and type of partial 
order. 

Table 1.2. Summary of Non-Preemptive Scheduling Complexity


Table 1.3 is very interesting in the sense that it delimits the boundaries between 
NP-completeness and polynomial solvability for the more constrained non-preemptive 
scheduling problem, where resources (Rsrc) other than processors are being requested by 
the tasks. As can be seen, by having no precedence relations, or for values of m less than 
2 in the first case, or by making m less than three in the second case, the resulting 
problems can be solved in polynomial time. [GJ75] 

Table 1.3. Complexity of the Scheduling Problem with Several Resources


Other important results are: 

“It is impossible to find a totally optimal run-time scheduler even if
any ready process is permitted to preempt any other process in
progress”. [Mok76]

“When there are mutual exclusion constraints, it is impossible to
find a totally on-line optimal run-time scheduler”. [Mok83]

“The problem of deciding whether it is possible to schedule a set of
periodic processes which use semaphores only to enforce mutual exclusion
is NP-hard”. [Mok83]

“The problem of computing a static schedule for a set of periodic
timing constraints is NP-hard”. [Mok83]

“Non-preemptive scheduling of periodic tasks when release times
are taken into consideration is NP-hard in the strong sense”. [JSM91]

“The processor allocation problem is NP-complete even for the
case where only two processors are available and the processor scheduling
problem resulting from any partition is easy”. [Mok83]

“The problem of finding an optimal schedule is NP-hard for a single
processor even if all tasks have the same ready time and deadline”. [LW90]

4. A Brief Note about the Periodic Task Complexity 

It is very common for authors of papers that deal with the scheduling of
non-periodic tasks, i.e., tasks that are executed only once, to infer that their algorithms or
methods can also be applicable to periodic tasks by simply applying the same algorithm to
the set of tasks occurring within a time period that is equal to the least common multiple 
of their periods. 

Although this assertion is true in most cases, one must note that a polynomial
time algorithm for scheduling non-periodic tasks may take exponential time to schedule a
set of periodic tasks using the same algorithm. To see this, consider an algorithm A that
schedules a set T of n non-periodic tasks in time $O(|T|^k)$, where $|T|$ is equal to the size of
the input instance. Clearly, by using a binary encoding, $O(n + \sum \log r_i + \sum \log c_i + \sum \log d_i)$
bits are needed to encode such an instance. Now, assume a set T of n periodic tasks with
periods $p_1, p_2, \ldots, p_n$, whose input size is $O(n + \sum \log r_i + \sum \log c_i + \sum \log d_i + \sum \log p_i)$.
Note that in the worst case the LCM of the periods is $p_1 \times p_2 \times \cdots \times p_n$. So, in order to use
algorithm A to schedule the periodic task set T, one must first transform T into an
equivalent set T' of non-periodic tasks with $p_2 \times p_3 \times \cdots \times p_n$ instances of task $T_1$,
$p_1 \times p_3 \times \cdots \times p_n$ instances of task $T_2$, $p_1 \times p_2 \times p_4 \times \cdots \times p_n$ instances of task $T_3$, and so on.
Clearly, the size $|T'|$ of the input instance T' is equal to

\[ |T'| = O\left( n + \sum_{i=1}^{n} \left[ (\log r_i + \log c_i + \log d_i) \times \frac{\mathrm{LCM}}{p_i} \right] \right) \]

where LCM denotes the least common multiple of the periods, and algorithm A will take
$O(|T'|^k)$ time to schedule all task instances in T'. But, since

\[ |T'| > C \times \left[ n + \sum_{i=1}^{n} (\log r_i + \log c_i + \log d_i + \log p_i) \right]^k \]

for any constants C and k, $O(|T'|^k)$
is exponential with respect to $|T|$.
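
The growth described in this argument can be observed numerically. The following Python sketch counts the task instances obtained by unrolling a periodic task set over one least common multiple of its periods and compares that count with the number of bits needed to write the periods down. The function names are hypothetical, and release times, computation times and deadlines are ignored to keep the illustration short.

```python
from functools import reduce
from math import gcd


def hyperperiod(periods):
    """Least common multiple of all periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)


def expanded_instances(periods):
    """Number of non-periodic task instances after unrolling one hyperperiod."""
    h = hyperperiod(periods)
    return sum(h // p for p in periods)


def period_bits(periods):
    """Bits needed to encode the periods in binary."""
    return sum(p.bit_length() for p in periods)


if __name__ == "__main__":
    for periods in ([2, 4, 8], [2, 3, 5, 7, 11, 13]):
        print(periods, period_bits(periods), hyperperiod(periods),
              expanded_instances(periods))
```

Harmonic periods keep the unrolled set small, whereas pairwise coprime periods make it grow with the product of the periods while the encoding grows only with the sum of their logarithms.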

5. Complexity Results for Message Routing in Distributed Systems 

This section presents some very interesting results from Leung [LTW89] regarding 
the possibility or impossibility of sending a set of messages in a distributed real-time 
system on time. Each message M_i is represented by the quintuple (s_i, e_i, l_i, r_i, d_i) where s_i
denotes the origin node for M_i, e_i denotes the destination node, l_i is the length of M_i, r_i is
the release time, and d_i denotes the deadline of M_i. The problem was studied for both
preemptive and non-preemptive cases, but this discussion will be restricted to the latter. It
is also assumed that the processors are connected by a unidirectional ring. Table 1.4
shows the complexity results for the non-preemptive transmission. An entry marked k 
denotes that the parameter is the same for all messages, while a V entry denotes that it can 
vary according to the message. 

Table 1.4. Complexity for Non-Preemptive Transmissions

As shown in Table 1.4, the message routing problem becomes NP-hard whenever two
or more parameters are allowed to be arbitrary. These and other results had a great
influence on the manner in which this dissertation will treat distributed scheduling.

II. CAPS AND PSDL OVERVIEW


A. MOTIVATION 

The United States Department of Defense (DoD) is currently the world’s largest 
user of computers. Each year, billions of dollars are allocated for the development and 
maintenance of progressively more complex weapons, communications, and
information systems. These systems increasingly rely on information processing, utilizing
embedded computer systems, and are often characterized by time periods or deadlines
within which some event must occur. Such periods or deadlines are known as “hard
real-time constraints”. Satellite control systems, missile guidance systems, and communications
networks are examples of embedded systems with hard real-time constraints. The
correctness and reliability of these software systems are critical, making software
development of these systems an immense task with increasingly high costs and potential 
for design errors [Boo87]. 

Over the past twenty years, advances in computer hardware
technology have reduced the hardware portion of total system cost from 85 percent to
about 15 percent. In the early 1970s, studies showed that computer software alone 
comprised approximately 46 percent of the total estimated DoD computer costs. Of this 
cost, 56 percent was devoted specifically to embedded systems. In spite of the 
tremendous expense, most large software systems were characterized as not providing the 
functionality that was desired, taking too long to develop, costing too much time or taking 
too much space to use, and lacking the ability to evolve to meet the user's changing needs
[Boo87]. 

Software engineering evolved in response to the need to more efficiently design, 
implement, test, install, and maintain larger and more complex software systems. The
term “software engineering” was coined in 1967 by a NATO study group, and endorsed
by the 1968 NATO Software Engineering Conference [Sch90]. The conference
concluded that software engineering should use the philosophies and paradigms of
traditional engineering disciplines. Numerous methodologies have been introduced to
support software engineering. The major approaches which underlie these different 
methodologies are the waterfall model [Lam88], the spiral model [Boe86], and the 
prototyping methods of development [Luq89]. 

B. THE WATERFALL MODEL 

The waterfall model describes a sequential approach to software development as 
shown in Figure 2.1. The requirements are completely determined before the system is 
designed, implemented and tested. The cost of systems developed using this model is very 
high. Required modifications that are realized late in the development of a system, such as 
during the testing phase, have a much greater impact on the cost of the system than they 
would have if they had been determined during the requirements analysis stage of 
development. Requirements analysis may be considered the most critical stage of software 
development, since this is when the system is defined. 

Figure 2.1. The Waterfall Model 

Requirements are often incompletely or erroneously specified, due to the often vast 
difference in the technical backgrounds of the user and the analyst. It is often the case that 
the user understands his application area but does not have the technical background to 
communicate his needs to the analyst, while the analyst is not familiar enough with the 
application to detect a misunderstanding between himself and the user. The successful 
development of a software system is strictly dependent upon this process. The analyst 
must understand the needs and desires of the user and the performance constraints of the 
intended software system in order to specify a complete and correct software system. 

Requirements specifications are still most widely written using the English 
language, which is an ambiguous and non-specific mode of communication. 

Another difficulty of the classical life cycle is that communication between a 
software development team and the customer or the system's users is weak. Most of the 
time the customer does not know what he or she wants. In that case it is hard to 
determine the exact requirements, since the software developer is also unfamiliar with the 
problem domain of the system. Formal specification languages are used to formalize 
customer needs to a certain extent. Another disadvantage of the classical project life cycle 
is that a working model of the software system is not available until late in the project time 
span. This can have two consequences: 

1) A major bug may remain undetected until the working program is reviewed, 
which can be disastrous [Pre87]; 

2) The customer will not have an idea of what the system will look like until it is 
complete. 

C. THE SPIRAL MODEL 

Large real-time systems and systems which have hard real-time constraints are not 
well supported by traditional software development methods because the designer of this 
type of system would not know if the system can be built with the timing and control 
constraints required until after much time and effort has been spent on implementation. A 
hard real-time constraint imposes a time-bound on the response time of a process which 
must be satisfied under all operating conditions. 

To solve the problems raised in requirements analysis for large, parallel, 
distributed, real-time, or knowledge-based systems, current research suggests an 
alternative paradigm for software development and evolution based on rapid prototyping 
[LB88]. The purpose of prototyping is to ensure that proposed requirements and system 
concepts adequately match the needs of the prospective client(s) before detailed 
optimization and implementation efforts begin. As a software methodology, rapid 
prototyping provides the user with increasingly refined systems to test and the designer 
with ever better user feedback between each refinement. The result is more user 
involvement throughout the development/specification process, and consequently, better 
engineered software. 

The prototyping method shown in Figure 2.2 has recently become popular. “It is a 
method for extracting, presenting, and refining a user's needs by building a working model 
of the ultimate system — quickly and in context” [Boa84]. This approach captures an 
initial set of needs, and quickly implements those needs with the stated intent of iteratively 
expanding and refining them as the user's and designer's understanding of the system 
grows. The prototype is only to be used to model the system's requirements, rather than 
as an operational system [You89]. 

Figure 2.2. The Prototyping Process 


This iterative prototyping process is also known as the “Spiral Model of Software 
Development” and is illustrated in Figure 2.3. In the prototyping cycle, the system 
designer and the user work together at the beginning of the project to determine the 
critical parts of the proposed system. The designer then implements a prototype of the 
system based on these critical requirements by using a prototype description language 
[Luq89]. The resulting system is presented to the user for evaluation. During these 
demonstrations, the user determines whether the prototype behaves as it is supposed to, 
examines user interface options, and, most importantly, verifies understanding of the 
problem and solution. If errors are found at this point, the user and the designer work 
together again on the specified requirements to correct them. Concurrently, a risk analysis 
is initiated to decide whether or not to move on to the next cycle of the spiral. This 
process continues until the user determines that the prototype successfully captures the 
critical aspects of the proposed system. This is the point where precision and accuracy are 
obtained for the proposed system. The designer then uses the prototype as a basis for 
designing the production software. 

Some advantages and disadvantages of iterative development methodology are 
listed below: 

Advantages: 

1) There is constant customer involvement (revising requirements). 

2) Software development time is greatly reduced. 

3) Methodology maps to reality. 

4) It allows use of off-the-shelf tools. 

Disadvantages: 

1) There are configuration control complexities. 

2) The developer is compelled to manage customer enthusiasm. 

3) There are uncertainties in contracting the iterative development. 

Manual construction of the prototype still takes too much time, and can 
introduce many errors. Also, it may not accurately reflect the timing constraints placed 
upon the system. What is needed is an automated method of rapidly prototyping a hard 
real-time system that reflects those constraints and requires minimal development time. 
Such a system should exploit reusable components and validate timing constraints. 

If reliable, affordable, and adaptable Ada software is to be produced and 
maintained, the characteristics of Ada are not the only important matter to consider; 
the characteristics of Ada software development environments may well be critical 
[BL91]. 

The rapid, iterative construction of prototypes within a computer aided 
environment automates the prototyping method of software development, and is called 
rapid prototyping. Rapid prototyping provides an efficient and precise means to determine 
the requirements for the software system, and greatly improves the likelihood that the 
software system developed from the requirements will be complete, correct, and 
satisfactory to the user. The potential benefits of prototyping depend critically on the 
ability to modify the behavior of the prototype with less effort than that required to modify 
the production software. Computer aided and object-based rapid prototyping provides a 
solution to this problem. 

D. THE COMPUTER-AIDED PROTOTYPING SYSTEM (CAPS) 

The Computer-Aided Prototyping System (CAPS) [LK88] is a software 
engineering tool for developing prototypes of real-time systems. It is useful for 
requirements analysis, feasibility studies, and the design of large embedded systems. 
CAPS is based on the Prototype System Description Language (PSDL) [LBY88], which 
provides facilities for modeling timing and control constraints within a software system. 
An overview of PSDL will be presented in the following section. CAPS is a development 
environment, implemented in the form of an integrated collection of tools, linked together 
by a user-interface, and provides the following kinds of support to the prototype designer: 

• timing feasibility checking via the scheduler, 

• consistency checking and some automated assistance for project planning, 
scheduling, designer task assignment, and project completion date estimation 
via the Evolution Control System, 

• design completion via the editors, 

• computer-aided software reuse via the software base. 

A CAPS prototype is initially built as an augmented data flow diagram and a 
corresponding PSDL program. The CAPS data flow diagram and PSDL program are 
augmented with timing and control constraint information, which is used to model the 
functional and real-time aspects of the prototype. The CAPS environment provides all of 
the necessary tools for engineers to quickly develop, analyze, and refine real-time software 
systems. 

The general structure of CAPS is shown in Figure 2.4. The CAPS User-Interface 
provides access to all of the CAPS tools, and facilitates communication between tools 
when necessary. The tools in Figure 2.4 are grouped into four sections: Editors, 
Execution Support, Project Control and Software Base. Each CAPS tool is associated 
with a different aspect of the CAPS prototyping process. 



CAPS is specifically designed to assist and partially automate development efforts 
which lie in the shaded regions of the prototyping process (Figure 2.2). Specifically, based 
on a set of initial requirements, CAPS allows the engineer to design, modify, demonstrate 
and validate a software system. Through this process, system requirements can be refined 
and modified as necessary. 

The CAPS prototyping process is a more specific refinement of the process shown in 
Figure 2.2, and is outlined below [Bro94]. 

1) Based on requirements, design (or modify) the data flow diagram for the system 

2) Assign all appropriate timing and control constraints to the prototype operators. 
Assign latencies to data streams (if required) 

3) Assign data types to all data streams 

4) Find (in the software base) or build an implementation module for each 
user-defined data type and each atomic operator. Modules taken from the software 
base can be modified after retrieval to suit individual needs 

5) Build the prototype's user-interface (if required) 

6) Translate the CAPS-generated (and user-augmented) PSDL program into (a 
portion of) the Ada supervisor module 

7) Run the CAPS scheduler to generate the static and dynamic schedules. This 
completes the prototype's Ada supervisor module 

8) Compile the prototype. (Note: for successful compilation, particular attention 
must be paid to the formal parameters of atomic operator implementation 
procedures created in step 4) 

9) Execute, evaluate and modify (if appropriate) the prototype and/or the 
requirements 

10) Return to Step 1 if prototype modification is required 

The correlation between these 10 steps and Figure 2.2 is obvious. Note that the 
basic 10 steps are somewhat more detailed than the preceding prototyping process diagram. 
This highlights the real-time requirements and associated design considerations of typical 
CAPS prototypes. 

The remainder of this introduction briefly introduces the CAPS tools used to 
perform the basic 10 steps. Note also that two of the CAPS tools are outside the 
purview of the prototyping process diagram. These tools perform ancillary functions 
which are not seen in either the prototyping process diagram or the 10 basic CAPS steps. 
These advanced feature tools are the Evolution Control System and the Merger. 

The purpose of the Evolution Control System is to provide automated support for 
coordinating the concurrent efforts of a team of prototype designers, and to manage 
multiple versions of the designs they produce [Bad93]. The purpose of the Merger is to 
combine the effects of two or more enhancements to a prototype that have been 
independently developed [Dam94]. 

CAPS can be executed in either the designer mode or the manager mode. The 
manager mode provides access to CAPS advanced features, including modification of the 
designer pool, creation of project work steps, and prototype change-merging. CAPS 
supports distributed prototype development, and the manager interface provides facilities 
for such efforts. For simple, single-designer prototype building, the designer mode should 
be used. 

1. CAPS Tools 

This section provides a brief description of each CAPS tool. 

a. The PSDL Editor 

The PSDL Editor is the heart of CAPS prototype design. This editor 
consists of three separate parts: the Syntax Directed Editor, the Graph Viewer, and the 
Graphic Editor. This tool allows the designer to create the CAPS data flow diagram and 
the PSDL program, and assign all timing and control constraints to prototype components 
(operators and data streams). 

b. The Text Editor 

Although the text editor is not exclusively a CAPS tool, CAPS does 
provide fluid integration of text editing facilities. Designers can select from vi, emacs and 
the Verdix Ada Syntax Directed Editor (if available) for editing Ada programs. Use the 
“CAPS Defaults” selection under the “CAPS Edit” pull-down menu to make this 
selection. The CAPS User-Interface provides convenient file selection lists, based on the 
currently selected prototype. 

c. The Interface Editor 

CAPS integrates TAE+ [Tae93] for creation of window-based 
user-interfaces for prototypes. When using the TAE Workbench for creation of such 
user-interfaces, the designer must use the “single file” Ada code generation option from within 
TAE+. The automatically generated TAE code is placed in the prototype directory in a 
file called 

<prototype_name>.RAW_TAE_INTERFACE.a. 

For details about how to integrate this file into a prototype, see Chapter 
VII of the CAPS Tutorial by Brockett [Bro94]. 

d. The Requirements Editor 

The current version of CAPS does not have a sophisticated requirements 
tracking or editing tool. Simple text editor integration is provided for editing 
requirements documents associated with a prototype. CAPS will automatically present 
the user with a list of all files with a “.req” suffix when “Requirements” is selected from 
the “Edit” pull-down menu. After a file is selected, the default text editor will be invoked 
on that file. 

e. The Change Request Editor 

As with requirements, the current version of CAPS does not have a 
sophisticated change request tracking or editing tool. Simple text editor integration is 
provided for editing change request documents associated with a prototype. CAPS will 
automatically present the user with a list of all files with a “.cf” suffix when “Change 
Request” is selected from the “Edit” pull-down menu. After a file is selected, the default 
text editor will be invoked on that file. 

f. The Translator 

The CAPS translator converts a PSDL program into compilable Ada 
packages which implement supervisory aspects of the prototype. The translator expects a 
complete PSDL program as input, and creates several packages which make up, in part, 
the supervisor module of the prototype. It is important to note that the translator does not 
create Ada implementation packages for atomic operators or user-defined data types. 
These must be either extracted from the software base, or custom-made by the designer. 

g. The Scheduler 

The scheduler determines schedule feasibility for CAPS prototypes. 
Information is provided to the scheduler via timing constraints from the prototype’s PSDL 
program. A prototype must be translated before it can be scheduled, and scheduled before 
it can be compiled. Upon scheduling a prototype, CAPS provides schedule diagnostic 
information which can be analyzed and used to direct timing constraint modifications. 

h. The Compiler 

CAPS uses the SunAda Ada compiler. The compilation process is 
completely automated via the “Compile” command provided in the “Exec Support” 
pull-down menu in the CAPS User-Interface. Successful prototype compilation requires the 
formal parameter lists of atomic operator implementation modules to conform to CAPS 
interface conventions. 

i. The Evolution Control System 

The CAPS Evolution Control System (ECS) [Bad93] is a system that 
supports distributed prototype development in a team environment. The ECS makes use 
of a design database (DDB) for persistent storage of prototype development data. The 
ECS supports maintenance of a designer pool from which to draw for prototype 
development tasks. Within the ECS, prototype development is modeled as a series of 
steps that are created by the project manager. These steps are automatically scheduled 
and assigned to available designers. 

j. The Merger 

The CAPS Merger [Dam94] provides automated prototype change-merging. 
Based on slicing theory, as applied to PSDL programs, the Merger automates 
the combination of two separate modifications to a base prototype. The Merger detects 
and warns of conflicts between the two changes to be merged. If no conflicts occur, or if 
they are overridden, the Merger creates a PSDL program for the newly created prototype 
which incorporates the changes of each of the modified prototypes. 

k. The Software Base 

The CAPS software base and its associated retrieval mechanism [Dol93] 
provide access to a repository of reusable Ada and PSDL components. The software base 
allows a designer to browse as well as query its components. Queries to the software base 
can be in the form of keywords or PSDL specifications. In the current release of CAPS, 
the software base matching mechanism is based on parameter matching. 

E. THE PROTOTYPE SYSTEM DESCRIPTION LANGUAGE (PSDL) 

PSDL is a partially graphical specification language developed for designing 
real-time systems. It has several facilities for modeling timing and control constraints, but is 
also useful for requirements analysis and feasibility studies. It was designed as a 
prototyping language specifically for CAPS, to provide the designer with a simple way to 
specify software systems [LBY88]. PSDL places strong emphasis on modularity, 
simplicity, reuse, adaptability, abstraction, and requirements tracing. 

A PSDL prototype is built as a hierarchical structure of components, graphically 
represented as data flow diagrams, and augmented with timing and control information. 
Each component may contain zero or more definitions for OPERATORS and TYPES, 
where each definition has two parts: 

• Specification part: Defines the external interfaces of the operator or the 
type through a series of interface declarations, provides timing constraints, and describes 
functionality by using informal descriptions and axioms. 

• Implementation part: Denotes what the implementation of the component 
is going to be, either in Ada or PSDL. Ada implementations point to Ada modules, which 
provide the functionality required by the component's specification. PSDL 
implementations are data flow diagrams augmented with a set of data stream definitions 
and a set of control constraints. 

1. PSDL Computational Model 

PSDL is based on a computational model containing OPERATORS that 
communicate via DATA STREAMS, where each stream carries values of a fixed abstract 
data type. There are several ADTs already built into PSDL; the PSDL_EXCEPTION is 
one of them. Modularity is supported through the use of independent operators that can 
only gain access to other operators when they are connected via data streams. 

The PSDL computational model is formally represented as an augmented graph 
[LBY88] 

G = (V, E, T(v), C(v)) 

where: 

• V is a set of vertices 

• E is a set of edges 

• T(v) is the set of timing constraints for each vertex v 

• C(v) is the set of control constraints for each vertex v 

Each vertex represents an operator and each edge represents a data stream. 
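
As an illustration only, the augmented graph can be held in a small data structure. The sketch below is written in Python rather than PSDL or Ada, and the class name, field names, and example operators are assumptions made for this example; they are not part of PSDL or CAPS.

```python
from dataclasses import dataclass, field


@dataclass
class AugmentedGraph:
    """G = (V, E, T(v), C(v)): operators, streams, per-vertex constraints."""
    vertices: set = field(default_factory=set)    # V: operator names
    edges: set = field(default_factory=set)       # E: (producer, consumer, stream)
    timing: dict = field(default_factory=dict)    # T(v): vertex -> timing constraints
    control: dict = field(default_factory=dict)   # C(v): vertex -> control constraints

    def add_operator(self, name, timing=None, control=None):
        self.vertices.add(name)
        self.timing[name] = dict(timing or {})
        self.control[name] = list(control or [])

    def add_stream(self, producer, consumer, stream):
        # Both endpoints must already be vertices of the graph.
        if producer not in self.vertices or consumer not in self.vertices:
            raise ValueError("stream endpoints must be declared operators")
        self.edges.add((producer, consumer, stream))


# Hypothetical two-operator prototype: a periodic sensor feeding a filter.
g = AugmentedGraph()
g.add_operator("sensor", timing={"MET": 10, "PER": 100})
g.add_operator("filter", timing={"MET": 20}, control=["TRIGGERED BY ALL raw"])
g.add_stream("sensor", "filter", "raw")
```

Keeping T(v) and C(v) as maps keyed by vertex mirrors the definition above, in which the constraints are attached to vertices rather than to edges.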

a. Operators 

An operator represents either a function or a state machine. When it fires, 
an operator reads one data object from each of its input data streams and writes at most 
one data object on each of its output streams. If the output depends only on the current 
set of input values, then the operator represents a function. In other words, the same 
response is given to the same inputs each time it is triggered. If, on the other hand, the output of the 
operator depends upon the input values and on internal state values representing some part 
of the history of the computation, then the operator represents a state machine. 
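
The distinction can be sketched in a few lines. This is a Python illustration, not PSDL; the names double and RunningSum are hypothetical and chosen only for this example.

```python
def double(x):
    """Function operator: the output depends only on the current input."""
    return 2 * x


class RunningSum:
    """State machine operator: the output depends on input and internal state."""

    def __init__(self, initial=0):
        self.total = initial  # the state is explicitly initialized

    def fire(self, x):
        self.total += x
        return self.total


acc = RunningSum()
outputs = [acc.fire(3), acc.fire(3)]  # same input twice, different outputs
```

Firing double with the same input always yields the same value, while the two firings of acc return different values because the second one sees the state left by the first.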

A PSDL operator can be either atomic or composite. Operators that are 
decomposed into lower levels are called composite operators, and they represent networks 
of components. This decomposition is always functional. An operator that is not 
decomposed is called atomic, and in the current version of CAPS, they are implemented in 
Ada, but any language could be used for that purpose. According to the PSDL grammar, 
it is in the implementation part of the operator that we can declare an operator to be 
atomic or composite. 

b. Data Streams 

Data streams represent sequential data flow mechanisms which move data 
between operators. There are two kinds of data streams: sampled streams and data flow 
streams. 

In PSDL the data trigger of a consumer operator determines the type of a 
data stream. If the stream is declared in the “TRIGGERED BY ALL” clause of the 
consumer operator, then the stream is a data flow stream. In all other cases it is a sampled 
stream. 

Data flow streams in the current implementation are similar to FIFO 
queues with a length of one. Any value placed into the queue must be read by another 
operator before any other data value may be placed into the queue, or it will overflow. 
Values read from the queue are removed from the queue, and any attempt to 
read from an empty queue causes an underflow. A sampled data stream may be considered as 
a programming variable which may be written to or read from at any time and as often as 
desired. A value is on the stream until it is replaced by another value. Some values may 
never be read, because they are replaced before the stream is sampled. Consequently, 
care must be taken when reading values from uninitialized sampled streams. All PSDL 
data streams contain, at most, one data item at any given time. In summary, 
a data flow stream guarantees that none of the data values are lost or replicated, 
while a sampled stream does not make such a guarantee. 
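
The two stream behaviors described above can be sketched as follows. This is a minimal Python illustration under the stated one-item assumption; the class names and the StreamError exception are hypothetical and do not correspond to the CAPS Ada implementation.

```python
class StreamError(Exception):
    """Raised on overflow or underflow of a data flow stream."""


class DataFlowStream:
    """One-slot FIFO: each value must be read exactly once."""

    def __init__(self):
        self._slot = None
        self._full = False

    def write(self, value):
        if self._full:
            raise StreamError("overflow: previous value not yet read")
        self._slot, self._full = value, True

    def read(self):
        if not self._full:
            raise StreamError("underflow: no value to read")
        self._full = False  # reading removes the value
        return self._slot


class SampledStream:
    """Behaves like a variable: writes overwrite, reads do not consume."""

    def __init__(self, initial=None):
        self._value = initial  # None models an uninitialized stream

    def write(self, value):
        self._value = value

    def read(self):
        return self._value
```

Writing twice to a SampledStream silently drops the first value, whereas the same sequence on a DataFlowStream raises an error; this is the "lost or replicated" distinction drawn in the summary above.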

c. State Streams 

A CAPS prototype is a well-formed PSDL program if its graph 
representation (excluding all state streams) is a directed acyclic graph (DAG). This 
restriction may not seem to make sense at first glance. However, when a prototype graph 
contains a cycle, this indicates the presence of state information, and states must be 
explicitly declared and initialized. PSDL fully supports the integration of states in its 
prototypes. 

When a state is introduced into an atomic operator, it must be implemented 
within the Ada code for that operator, and should not appear in the graph as a self-loop 
state edge. 
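
The well-formedness condition can be checked with a standard topological sort. The sketch below is in Python and uses Kahn's algorithm, which is a choice made for this example and not necessarily how CAPS performs the check; the function name and the triple format for streams are also assumptions.

```python
from collections import defaultdict, deque


def is_well_formed(operators, streams):
    """Return True if the graph is acyclic once state streams are dropped.

    operators: iterable of operator names.
    streams: iterable of (producer, consumer, is_state) triples.
    """
    successors = defaultdict(list)
    indegree = {op: 0 for op in operators}
    for producer, consumer, is_state in streams:
        if is_state:
            continue  # state streams are excluded from the check
        successors[producer].append(consumer)
        indegree[consumer] += 1

    # Kahn's algorithm: repeatedly remove vertices with no incoming edges.
    ready = deque(op for op, degree in indegree.items() if degree == 0)
    removed = 0
    while ready:
        op = ready.popleft()
        removed += 1
        for nxt in successors[op]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # Every vertex is removed exactly when no cycle remains.
    return removed == len(indegree)
```

A feedback edge from b to a makes the graph cyclic, but the same edge declared as a state stream is ignored and the prototype passes the check.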

d. Types 

PSDL user-defined data types are abstract data types (ADTs) which can be 
used in CAPS prototypes. PSDL types, like PSDL operators, can be implemented in 
either PSDL or Ada. Types can be associated with a set of operators. Types implemented 
in Ada are realized by an Ada package that defines a private type and a subprogram for 
each operator on that type. 

e. Exceptions 

Exceptions in PSDL are values that can be transmitted on data streams of 
the type “PSDL_EXCEPTION”. During prototype execution, undeclared exceptions are 
transformed into PSDL exceptions of the type PSDL_EXCEPTION, which is a subtype of 
UNDECLARED_ADA_EXCEPTION. Exceptions can also be raised by explicitly 
declaring them in the control constraints part of the PSDL program for the prototype. 

f. Timers 

PSDL timers are software stopwatches that are used to record the length of 
time between events, or to control the duration the system spends in some particular state. 
They are declared in the implementation part of a root operator, and are governed by the 
control constraints “START TIMER”, “STOP TIMER” and “RESET TIMER”. 

2. Control Abstractions 

As a major property of real-time systems, periodic execution, as well as other 
timing related attributes, is supported explicitly. The order of execution is only partially 
specified; it is determined from the data flow relations given in the enhanced data flow 
diagrams, and is also affected by the types of data triggers among operators. 

There are several control aspects to be specified, such as whether the operator is 
periodic or sporadic, the triggering conditions, and the output guards. 

a. Periodic and Sporadic Operators 

PSDL supports both periodic and sporadic operators. Periodic operators 
are triggered by the scheduler at approximately regular time intervals, so that they start 
execution somewhere after the beginning of the period, and complete by some deadline, 
which defaults to the end of the period. Sporadic operators are triggered by the arrival of 
new data, and possibly at irregular time intervals. 

b. Data Triggers 

Any PSDL operator can have a data trigger, of which there are two kinds, 
as illustrated by the following examples: 

OPERATOR P TRIGGERED BY ALL X, Y, Z 

OPERATOR Q TRIGGERED BY SOME A, B 

In the first example, the operator P is ready to fire whenever new data 
values have arrived on all three streams X, Y and Z (the triggering set). There may be 
other streams coming into the operator P; the data values on those streams do not need to be 
new. This means that the data streams associated with X, Y and Z are data flow streams. 
This kind of trigger should be used when the items in a stream represent discrete events 
(e.g., transactions on a bank account) rather than samples from a continuous source of 
data (e.g., a temperature sensor). This kind of trigger also ensures that the output of the 
operator is always based on fresh data for all of the inputs in the triggering set. 

The most important design consideration when “BY ALL” triggers are 
used is management of the firing frequencies of the producing and consuming operators. 
The period of the consuming operator must be less than or equal to the period of the 
producing operator, or stream buffer overflow errors will result (i.e., the consuming 
operator must fire at least as often as the producing operator). This is because the 
streams in CAPS can hold a maximum of one data item. CAPS ensures that if the 
consuming operator’s period is less than that of the producing operator, the actual firing 
rate of the two will be the same (i.e., “BY ALL” trigger data streams are tested for new 
information prior to the actual firing of the consuming operator). 

In the second example, the operator Q is ready to fire whenever new data 
arrives on at least one of the inputs A or B. This kind of activation condition guarantees 
that the output of operator Q is based on the most recent data from at least one of its 
critical inputs A and B, mentioned after the TRIGGERED BY SOME clause. This is also 
a very constrained condition, since the scheduler must guarantee that new data on A or B 
will not be lost. 

If a periodic operator has a data trigger, the operator is conditionally 
executed with the data trigger serving as input guard. 

If a data trigger is not satisfied, the values are not read and, consequently, 
not consumed from any of the input streams. 
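
The two trigger kinds, and the period rule for “BY ALL” streams, reduce to simple predicates. The following Python sketch is illustrative only; the function names and argument formats are assumptions made for this example and are not CAPS interfaces.

```python
def trigger_satisfied(kind, triggering_set, fresh):
    """Decide whether a data trigger is satisfied.

    kind: "ALL" or "SOME".
    triggering_set: stream names listed in the TRIGGERED BY clause.
    fresh: set of stream names that currently hold new data.
    """
    if kind == "ALL":
        return all(stream in fresh for stream in triggering_set)
    if kind == "SOME":
        return any(stream in fresh for stream in triggering_set)
    raise ValueError("unknown trigger kind: " + kind)


def by_all_period_violations(periods, by_all_streams):
    """List (producer, consumer) pairs whose consumer fires too rarely.

    periods: operator name -> period.
    by_all_streams: (producer, consumer) pairs joined by a BY ALL stream.
    """
    return [(p, c) for p, c in by_all_streams if periods[c] > periods[p]]
```

For operator P above, trigger_satisfied("ALL", ["X", "Y", "Z"], fresh) is true only when all three streams are fresh; for operator Q, the "SOME" form is true as soon as either A or B is fresh.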

c. Execution Guards 

The firing of a PSDL operator can be regulated by an execution guard. 
Execution guards are conditional statements which are evaluated prior to firing the 
associated operator. Execution guards can depend on data from any incoming data stream 
and they can be combined with the “BY ALL” and “BY SOME” data triggers mentioned 
above. If an execution guard is not satisfied, the values are still read and consumed from 
all the input streams, but the operator does not fire. Examples are: 

OPERATOR R TRIGGERED BY SOME X, Y IF X > 20.0 

OPERATOR S TRIGGERED IF X: EXCEPTION 
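
The difference between a failed data trigger (nothing is read) and a failed execution guard (inputs are consumed, but the operator does not fire) can be sketched as follows. This Python fragment is a simplified model: every input is treated as a one-slot data flow stream represented by a list, and the function name is hypothetical.

```python
def attempt_firing(inputs, trigger_ok, guard, body):
    """Model one firing attempt of an operator.

    inputs: dict of stream name -> one-element list holding the pending value.
    trigger_ok: whether the data trigger is satisfied.
    guard: predicate over the values read (the execution guard).
    body: the operator's computation over the values read.
    Returns the operator's result, or None when it does not fire.
    """
    if not trigger_ok:
        return None  # trigger failed: nothing is read or consumed
    # Read and consume one value from every input stream.
    values = {name: slot.pop() for name, slot in inputs.items()}
    if not guard(values):
        return None  # guard failed: the values are already consumed
    return body(values)
```

With the guard X > 20.0 from operator R above and X = 5.0 on the stream, the call returns None and leaves every input stream empty, matching the rule stated in the text.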

d. Conditional Output 

PSDL conditional output is implemented in CAPS as guarded execution of 
code that writes values to data streams. Conditional output does not affect the firing of an 
operator, which will fire in accordance with the CAPS schedule regardless of whether or 
not its output is written to an output data stream. The condition of an output guard may 
depend on the output values of the operator, on the values read from the input streams, 
and on the values of timers. 

3. Timing Constraints 

Operators can be time-critical or non time-critical, depending on whether or not 
they are assigned a value for the maximum execution time (MET) by the designer. If 
time-critical, they can be further subdivided into periodic or sporadic operators. Periodic 
operators are explicitly assigned a frequency (PERIOD) of execution, meaning that they 
will fire exactly once within each period, but not necessarily at regular intervals of 
time. Sporadic operators are not explicitly assigned a period; they fire whenever there 
is new data on a set of input data streams, subject to a minimum interval of time 
between successive firings. Periodic operators can also be triggered by the arrival of data. 
However, this trigger will behave like a condition to be checked during periodic firing. 
Every sporadic operator has an MRT and MCP in addition to an MET. 

Timing constraints are an essential part of specifying real-time systems, and in 
PSDL the following timing constraints are supported: 

• Maximum Execution Time (MET) 

• Period (PER) 

• Finish Within (FW) 

• Maximum Response Time (MRT) 

• Minimum Calling Period (MCP) 

• Latency (LAT) 

• Minimum Output Period (MOP) 

The MET reflects the amount of CPU time that an operator may use for execution, 
and is applicable to both periodic and sporadic operators. Note that for atomic operators 
the MET complies with the above definition. For the composite operator, however, the 
MET is the maximum CPU time needed along any thread of control. Within CAPS, the 
MET is assumed to account for the following: data triggering checks, stream reads, 
execution guards checks, the execution itself, output guards checks, stream writes, and 
exception handling. 
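
One possible reading of the composite case is sketched below in Python. It assumes that a "thread of control" corresponds to a path through the acyclic decomposition graph and that the METs of the atomic operators on a path add up; both are assumptions made for this illustration, and the function name is hypothetical.

```python
from functools import lru_cache


def composite_met(met, successors):
    """Largest sum of atomic METs along any path of an acyclic decomposition.

    met: atomic operator name -> MET.
    successors: operator name -> list of operators fed by its outputs.
    """
    @lru_cache(maxsize=None)
    def longest_from(op):
        # Longest remaining path starting at op, including op itself.
        tails = [longest_from(nxt) for nxt in successors.get(op, ())]
        return met[op] + max(tails, default=0)

    return max(longest_from(op) for op in met)
```

For example, with METs a=5, b=10, c=3, d=4 and streams a to b, a to c, b to d, and c to d, the costliest path is a, b, d, giving a composite MET of 19 under these assumptions.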

This parameter is by itself one of the most difficult to quantify. It is, therefore, 
unfortunate that it is also one of the most important parameters employed during the 
scheduling process. Two alternatives can be taken: to use the worst-case execution times, 
which can result in a poor processor utilization, or to use some value smaller than the 
worst-case, which introduces the possibility of an overload. For reasons of safety, CAPS 
uses the first approach by defining the MET as an upper-bound on the execution time. 
For further reading about execution time issues refer to Leinbaugh [Lei80, LY82] and 
Mok [Mok83]. 

Actually, due to the critical nature of the systems that CAPS was intended to 
prototype, the worst-case approach has been used throughout its design. This approach is 
observable even in the scheduling model, where the non-preemption option was chosen. 
This is because, if a non-preemptive schedule can be devised for a set 
of tasks, then it is also possible to devise a preemptive one, whereas the converse is not 
always true [Bla76]. 

The MRT defines an upper-bound on the time between the arrival of new data that 
satisfies all data triggering conditions of a sporadic operator and the time when the last 
value is written onto the output stream. The MRT applies only to sporadic operators. 

The MCP also applies only to sporadic operators, and represents a lower-bound on 
the time between two consecutive triggerings of a sporadic operator. It constrains the 
behavior of the producers of the triggering data values, rather than constraining the 
behavior of the operator itself. Both timing constraints are illustrated in Figure 2.5. 

As shall be seen later, each sporadic operator is going to be converted into an 
equivalent periodic one, whose period is called the triggering period (TP). 

Scheduling delay for a sporadic operator is the interval of time between the writing 
into an output data stream by the producer and the corresponding reading of the input 
values by the consumer. 



Figure 2.5. Sporadic Timing Constraints 


Periodic operators are triggered by temporal events which must occur at regular 
intervals. For each operator, these activation times are determined by the specified period 
(PER), which is the time interval between two successive activations. The period applies 
only to periodic operators. Note, however, that there is a distinction between activation 
time and the actual start time of a periodic operator as shown in Figure 2.6. 



Finish within (FW) defines an upper bound on the finish time for a periodic 
operator. The difference between the activation time and its deadline is called the 
scheduling interval (SI) and it is equal to FW. 

Scheduling intervals of a periodic operator can be viewed as fixed windows of a 
size equal to FW, evenly separated by the period PER, and whose absolute position on the 
time axis is determined by the start time t of its first execution. For the first instance this 
time may vary within the closed interval [0, PER] of the operator, and is called the phase 
of the operator (Figure 2.7). Scheduling intervals for sporadic operators will be covered 
in the next chapter, after we discuss how to deal with this type of operator. 

Figure 2.7. The Scheduling Interval 

The difference between FW and MET is called the slack of the operator. Table 
2.1 summarizes the timing constraints for periodic and sporadic operators. 
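
The relationships among MET, PER, FW, the slack and the scheduling interval can be sketched in a few lines of Python. The class name, field names and sample values below are illustrative assumptions and are not part of PSDL or CAPS; times are taken to be integers in milliseconds.

```python
class PeriodicOperator:
    """Hypothetical container for the (MET, PER, FW) timing constraints."""

    def __init__(self, met, per, fw):
        self.met = met  # maximum execution time
        self.per = per  # period
        self.fw = fw    # finish-within

    @property
    def slack(self):
        # Slack is the difference between FW and MET.
        return self.fw - self.met

    def scheduling_interval(self, i, phase=0):
        # Window of the i-th instance (1-based): fixed size FW,
        # separated by PER, positioned by the phase of the operator.
        activation = phase + (i - 1) * self.per
        return activation, activation + self.fw


op = PeriodicOperator(met=75, per=500, fw=300)  # assumed sample values
print(op.slack)                          # 225
print(op.scheduling_interval(3, 40))     # (1040, 1340)
```

The deadline of each window is its activation time plus FW, so the windows repeat every PER time units regardless of when the operator actually starts inside them.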



Table 2.1. Main PSDL Timing Constraints 

To express the behavior of distributed systems, PSDL provides two timing 
constraints: Latency (LAT) and the Minimum Output Period (MOP). The latency of a 
stream is an upper-bound on the duration of the time interval between the instant a data 
value is written into a stream and the instant that data value becomes available for reading 
from the stream. In other words, the latency attribute for a stream is meant to specify an 
upper-bound on the allowable time spent by that stream in the network. This information 
should be used by the scheduler to simulate the worst case behavior for the delay in the 
network. Note, however, that this attribute does not explicitly require that the data 
carried by the stream should be consumed, within the time interval, by the consumer 
operator on the other side of the network. The notation LAT_xy will be used to denote the 
latency associated with the stream between operators Tx and Ty. 

The minimum output period is a lower-bound on the duration of the interval 
between two successive write events on the stream. In the absence of explicit 
synchronization, both the latency and minimum output period of a stream have the default 
value of zero (no delay, unbounded data rate). The purpose of these additional constraints 
is to declare communication constraints that arise from hardware limitations imposed by 
external constraints on how the software functions must be allocated to different physical 
nodes of a distributed system. Explicit modeling of these constraints is also sometimes 
required to ensure feasibility, because latency affects calculations of time budgets, as well 
as maximum execution times for composite operators. The effect of these constraints on 
static scheduling is that data cannot be read from a stream until a delay equal to the 
latency has elapsed, and that data cannot be written into a stream until the minimum 
output period has elapsed since the previous write. 

4. A PSDL Prototype Example 

Figure 2.8 shows a simple autopilot system that illustrates some of the typical 
features of PSDL. The example has a minimal specification part with an informal 
description. The implementation part contains a graph, making the operator Autopilot a 
“composite” operator. The figure also indicates maximum execution times: 170 ms for 
operator display, 50 ms for operators compass and altimeter, and 75 ms for the remaining 
operators. All operators are periodic with a period of 500 ms, except for the operator 
control_surfaces, which is sporadic, with an MRT and MCP of 900 ms, as shown in 
the control constraints part of the PSDL program. 

In conclusion, the operator control_surfaces will be triggered 
whenever there is new data in either the course_command or the altitude_command 
streams. The operators correct_altitude and correct_course will be triggered whenever 
there is new data in the actual_altitude and actual_course streams, respectively. 

OPERATOR autopilot 
SPECIFICATION 
    STATES [illegible]: INTEGER INITIALLY 0 
    STATES [illegible]: INTEGER INITIALLY 0 
    STATES desired_course: INTEGER INITIALLY 0 
    STATES desired_altitude: INTEGER INITIALLY 0 
END 

IMPLEMENTATION 
    GRAPH 
        [graph not reproduced; one recoverable edge label: elevator_status] 

    DATA STREAM 
        actual_altitude: INTEGER, 
        actual_course: INTEGER, 
        altitude_command: [illegible], 
        course_command: [illegible], 
        elevator_status: elevator_status_type, 
        rudder_status: rudder_status_type 

    CONTROL CONSTRAINTS 
        OPERATOR altimeter 
            PERIOD 500 MS 
        OPERATOR compass 
            PERIOD 500 MS 
        OPERATOR control_surfaces TRIGGERED BY SOME course_command, altitude_command 
            MAXIMUM RESPONSE TIME 900 MS 
            MINIMUM CALLING PERIOD 900 MS 
        OPERATOR correct_altitude TRIGGERED BY ALL actual_altitude 
            PERIOD 500 MS 
        OPERATOR correct_course TRIGGERED BY ALL actual_course 
            PERIOD 500 MS 
        OPERATOR display 
            PERIOD 500 MS 
END 

Figure 2.8. Prototype of an Autopilot 

III. FUNDAMENTAL ISSUES IN REAL-TIME SCHEDULING 


A. THE SCHEDULING MODEL AND SOME DEFINITIONS 

An instance of a prototype T can be thought of as the union of three disjoint finite 
sets, namely the set P of periodic operators, the set S of sporadic operators and the set N 
of non-time-critical operators. Within CAPS, each periodic operator can be described, for 
scheduling purposes, as a three-tuple (METx, PERx, FWx), where METx is the maximum 
execution time used by each instance of operator X, PERx is its period and FWx is the 
length of its scheduling interval. Likewise, each sporadic operator can be described as a 
three-tuple (METx, MCPx, MRTx)^SP, where MCPx is the minimum period between two 
consecutive instances of operator X, and MRTx is the upper bound on the time between 
the triggering of operator X by some new data arrival and the completion of writing to all 
of its output streams. The superscript SP is used in the sporadic case only to distinguish 
it from the three-tuple of the periodic operator. Given any static schedule for a prototype T, 
we shall use s_ix, f_ix and d_ix to denote the actual starting time, completion time and deadline 
of the i-th instance of operator X in the schedule. In any feasible schedule, we must have 

$$0 \le s_{1x} \le \mathrm{PER}_x$$ 

and 

$$d_{ix} = s_{1x} + (i-1) \times \mathrm{PER}_x + \mathrm{FW}_x \qquad \text{Eq. (1)}$$ 

for every periodic operator X, where s_1x is called the phase of operator X as defined in 
Chapter II. Note also from Eq. 1 that the deadline for the first instance of any operator is 
calculated relative to its start time rather than from time zero*. This condition releases 
the scheduler from enforcing the condition that the first instance of operator X should 
finish by time PERx. Whenever possible, the letters X and Y will be used to 
denote operators, leaving the letters i and j to denote their corresponding instances. 


*Time zero is defined as the time when prototype starts execution. In reality it is the start time of the 
first operator according to the topological sort. 

Since, in general, the release time does not affect the complexity of the scheduling 
problem [Mok83], it will be assumed that all first instances are released at time zero, but 
may be constrained by the precedence relationship between the operators, if one exists. 

By definition, every periodic operator must start and finish execution within its 
period of activation. 

The following restriction is also imposed on the model: the maximum 
execution time must be smaller than or equal to the finish-within, which in turn must be 
smaller than or equal to the period: 

$$\mathrm{MET} \le \mathrm{FW} \le \mathrm{PER}$$ 

Clearly, the first inequality is needed, otherwise there is no way to execute such an 
operator within the specified amount of time (FW). 

One may argue that the second inequality should be relaxed to allow PER < 
MET ≤ FW. However, since PER < MET, such processor demand can only be satisfied using 
pipelining in a multiprocessor environment [Luq93, LSB93], which will be discussed in the 
next section. 

Note that for the sporadic operator all of the above assumptions are also 
applicable, since they will be converted into equivalent periodic operators, as can be seen 
later in this chapter. 

The Harmonic Block (HB) of a periodic task set P is the least common multiple 
(LCM) of all the periods in P. It is the interval upon which the task set will be tested for 
schedulability. If a feasible schedule can be found within 2×HB, in the case where 
latencies are not allowed in the schedule, or within at most 3×HB if latencies are allowed, 
then it is possible to say that the same pattern can be repeated forever. This topic will be 
further discussed in Section C. 
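
As a small illustration, the harmonic block and the number of instances of each task inside it can be computed as follows. This is a minimal sketch; the function name and the sample periods are assumptions chosen only for the example.

```python
from functools import reduce
from math import gcd


def harmonic_block(periods):
    """Least common multiple of all periods (integer time units)."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


periods_ms = [45, 40, 100]           # assumed sample periods
hb = harmonic_block(periods_ms)      # 1800
instances = {per: hb // per for per in periods_ms}
print(hb, instances)                 # 1800 {45: 40, 40: 45, 100: 18}
print(2 * hb, 3 * hb)                # search bounds without / with latencies
```

Because every period divides the harmonic block, each task contributes a whole number of instances to it, which is what allows a schedule found on this interval to be repeated.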

A prototype T is said to be schedulable if there exists a schedule such that the 
completion time for the execution of instance i of operator X (f_ix) is less than or equal to 
its corresponding deadline d_ix, for all i and X, and the precedence constraints of the 
prototype T are satisfied. 

The precedence constraint between operators X and Y, written as X < Y, where < 
denotes a partial ordering on the execution of tasks X and Y, is satisfied if, for all 
instances i and j, 

$$(i-1) \times \mathrm{PER}_x + s_{1x} < (j-1) \times \mathrm{PER}_y + s_{1y}$$ 

and 

$$(j-1) \times \mathrm{PER}_y + s_{1y} + \Delta < i \times \mathrm{PER}_x + s_{1x}$$ 

where (i-1) × PERx = (j-1) × PERy * and Δ equals the maximum time to read the input of 
operator Y. 

Operators from either the periodic set P or from the sporadic set S are non-preemptable, 
which means that once they start execution they will run to completion. The 
only operators that can be preempted are those belonging to the set N. 

No idle time is inserted into the static schedule, unless there are no operators ready 
to execute. 

All timing information is assumed to be an integral multiple of a basic unit of time, 
which within CAPS is assumed to be the millisecond. Table 3.1 presents a summary of the 
major assumptions of the scheduling model. 

• For all periodic operators, MET ≤ FW ≤ PER 

• All time-critical operators are non-preemptable 

• Time is discrete 

• A periodic operator is completely specified by the tuple (MET, PER, FW) 

• A sporadic operator is completely specified by the tuple (MET, MCP, MRT)^SP 

• Static scheduling is assumed 

Table 3.1. Summary of our Scheduling Model 

In the next section, a series of theorems on schedulability for a set of independent 
non-preemptive periodic task sets will be presented. They will provide the necessary 
background to build a framework upon which the later sections of this chapter will be 
based. 


* This condition will be relaxed after we present our new synchronization model in Chapter IV. 

B. CONDITIONS FOR SCHEDULABILITY OF NON-PREEMPTIVE TASKS 

In this section, a series of schedulability checks are introduced for a periodic task 
set P that has no precedence constraints. These results will be also applied to a set of 
periodic tasks with precedence constraints in Section D of this chapter. 

1. The Maximum Execution Time Theorem 

When dealing with non-preemptive uniprocessor static scheduling, a sufficient 
condition for infeasibility occurs whenever a task requires more computation time than 
the period of any other task, or more specifically, more than the minimum period among 
all tasks. Formally: 

Theorem 1: 

“For an independent periodic task set P, if ∃ some tasks X and Y ∈ P, such that 
METx ≥ PERy, then P is not schedulable in the uniprocessor case by any non-preemptive 
algorithm. Furthermore, if X = Y then neither the preemptive nor the non-preemptive 
algorithms can find a feasible schedule.” 

Proof: 

Clearly, whenever task X executes, task Y, which happens to have a smaller 
period, will be blocked for an interval of time bigger than its period, which contradicts 
the definition of a periodic task. □ 

Note that the Theorem still holds if a precedence relationship exists among the tasks 
in P. This same result is also valid for a sporadic task set when METx > MCPy for X = Y 
(trivial case). However, for X ≠ Y the situation is slightly more complex, and there are 
two cases to consider. The first is when MRTy < MCPy, and it is clearly not schedulable. 
The second case is when MRTy ≥ MCPy, and the set is not schedulable if METx + METy > 
MRTy, as shown in Figure 3.1. 

Corollary: (for the distributed case) 

“For an independent periodic task set P, if ∃ some tasks X and Y ∈ P, such that 
METx ≥ PERy, then in order for P to be schedulable in the multiprocessor case, tasks X 
and Y must be placed in different processors, and if X = Y, then it must be pipelined.” □ 
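
The check behind Theorem 1 and its corollary can be sketched as a simple pairwise comparison. The sketch assumes the comparison is non-strict (METx ≥ PERy); the function name, the dictionary layout and the task values are hypothetical.

```python
def theorem1_violations(tasks):
    """Return all pairs (X, Y) with MET_X >= PER_Y.

    tasks maps a task name to its (MET, PER, FW) tuple. Any returned pair
    rules out a non-preemptive uniprocessor schedule; a pair (X, X) means
    that X would have to be pipelined.
    """
    return [
        (x, y)
        for x, (met_x, _, _) in tasks.items()
        for y, (_, per_y, _) in tasks.items()
        if met_x >= per_y
    ]


tasks = {"A": (30, 100, 100), "B": (5, 20, 20)}   # assumed sample set
print(theorem1_violations(tasks))                 # [('A', 'B')]
print(theorem1_violations({"OPa": (150, 100, 150)}))  # [('OPa', 'OPa')]
```

In the first case tasks A and B would have to be placed on different processors; in the second case the only option left is pipelining.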

The conditions imposed on a task X for it to be pipelineable as well as a detailed 
description of pipelining in this context, can be found in the work of Luqi [Luq93] and 
Luqi, Shing and Brockett [LSB93]. 

There are two ways to handle pipelining. The first is to use task migration at run-time, 
which involves sending a copy of the code and data to be executed to the other 
processor. This presents the following problems: 

1) It increases the context switching overhead, with direct impact on the timing 
constraints 

2) There is a need to create an additional task to handle the dispatching of tasks 

3) It is not well suited for static scheduling 






The second approach is to replicate the tasks to be pipelined on the other processors 
in a pre-processing step. For example, consider a periodic operator OPa(150,100,150) 
with inputs D1, D2 and output D3 as shown in Figure 3.2. As shown in Figure 3.2b, we 
can replace operator OPa with two identical operators, OPb(150,200,150) and 
OPc(150,200,150), with twice the original period and a state stream syn, whose latency 
equals the time taken by the non-overlappable segment of the code implementing operator 
OPa. The operators OPb and OPc will be triggered alternately on the value of syn. 



The replication of tasks throughout the system presents the following problems: 

1) It increases the memory requirements for the processors 

2) It demands highly sophisticated mechanisms for implementing tightly 
synchronized schedules among the processors, which restricts this approach to 
the shared memory models with a global clock 

Both of the methods discussed above, however, suffer from the very serious 
problem of having to quantify the timing parameters of the segments of code that cannot 
be overlapped, which is by itself one of the hardest problems. If those timing parameters 
could be known in advance, then the operator could be separated into independent parts, and 
pipelining would not be needed. 

The validity of pipelining in a hard real-time environment is therefore questionable, 
and, furthermore, it is impossible to implement in a distributed system where there is no 
inexpensive method by which to assure tight synchronization among tasks. 

2. The Finish-Within Theorem 

Theorem 2: 

“For an independent periodic task set P, if ∃ some indivisible task X ∈ P such that 
METx > FWx, then P is not schedulable under any scheduling algorithm, not even in a 
multiprocessor environment.” 

Proof: 

Clearly, if METx > FWx, the only way to handle this case is if we could split task X 
into two or more data independent partitions, so that they could run in parallel on different 
processors, but, as stated in the theorem, X is indivisible. □ 

Note that this theorem can be easily extended to cover the sporadic case when 
METx > MRTx. It is also applicable to the case where we have precedence constraints in 
the set P. 

3. The Minimum Period Theorems 

At the other extreme from Theorem 1, there is a sufficient but not necessary condition 
to guarantee schedulability of an independent periodic task set, as stated in Theorem 3: 

Theorem 3: 

“For a periodic task set P, if ∀ tasks X ∈ P, FWx ≥ PERz and 

$$\sum_{x=1}^{n} \mathrm{MET}_x \le \mathrm{PER}_z$$ 

where PERz denotes the minimum period in P, then P is schedulable.” * 

Proof: 

The minimum period is certainly a divisor of the least common multiple of the 
periods (LCM), and, as such, it can span the entire LCM within an integral number of 
steps. It is a kind of sliding bin-packing where a sliding window of size equal to the 
minimum period is present and is always large enough to fit all tasks present in that window. 
Of course, depending on the periods, all instances may not be active simultaneously in that 
specific window. However, in the event that it does happen, the instances will always fit 
in there. □ 

* A similar result was achieved independently by Zhu, et al. [ZLC94]. 

As shall be seen later, this theorem is valid even when precedence constraints are 
taken into consideration. 

Figure 3.3. The Minimum Period Sliding Window 


It is possible to use a counter example to show that the above condition is a 
sufficient but not necessary condition. Consider two periodic tasks with the following 
timing constraints: (5,10,10) and (2.5,5,5). The sum of METs is bigger than the minimum 
period, but this task set is still schedulable. 
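
The sufficient test of Theorem 3 and the counter example above can be sketched as follows. The sketch assumes the precondition on the deadlines is that every FW is at least the minimum period; the function name is hypothetical and the second task set is an added illustration.

```python
def theorem3_guarantees(tasks):
    """Sufficient (not necessary) test; tasks is a list of (MET, PER, FW)."""
    min_per = min(per for _, per, _ in tasks)
    all_fit = sum(met for met, _, _ in tasks) <= min_per
    deadlines_ok = all(fw >= min_per for _, _, fw in tasks)
    return all_fit and deadlines_ok


# Counter example from the text: the test fails (7.5 > 5) although the
# set is schedulable, so the condition is not necessary.
print(theorem3_guarantees([(5, 10, 10), (2.5, 5, 5)]))   # False

# Assumed sample set where the sliding window always has room.
print(theorem3_guarantees([(2, 10, 10), (2, 5, 5)]))     # True
```

A result of False therefore says nothing about schedulability; only a result of True is conclusive.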

What happens if all deadlines are restricted to be less than or equal to their 
corresponding periods? In this case it could be said that Theorem 3 is not applicable, as 
illustrated by the following example: (3,5,3), (1,10,3). 

Theorem 4: 

“For a periodic task set P, if, taken over all tasks X ∈ P, 

$$\sum_{x=1}^{n} \mathrm{MET}_x \le \mathrm{FW}_z$$ 

where FWz denotes the minimum FW in P, then P is schedulable.” 

Proof: 

The same idea of sliding bin-packing applies here. Now, however, the size of the 
bin must be decreased. In other words, the “bin” now should be understood to be the 
least value among all periods and FWs of the tasks in P. □ 
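
Under the model assumption MET ≤ FW ≤ PER, the smallest FW is also the smallest value among all periods and FWs, so the test of Theorem 4 reduces to one comparison. The function name is hypothetical, and the second task set is an assumed illustration.

```python
def theorem4_guarantees(tasks):
    """Sufficient test: the sum of all METs fits in the smallest FW."""
    min_fw = min(fw for _, _, fw in tasks)
    return sum(met for met, _, _ in tasks) <= min_fw


print(theorem4_guarantees([(3, 5, 3), (1, 10, 3)]))   # False: 3 + 1 > 3
print(theorem4_guarantees([(1, 5, 3), (1, 10, 3)]))   # True:  1 + 1 <= 3
```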

The next theorem to be presented is the Load Factor Theorem, which is very well 
known in the field of scheduling. It defines a necessary condition for the schedulability of 
a periodic task set, and it basically stipulates that if the summation of all individual load 
factors (METx/PERx) is bigger than the number of available processors, then the set is not 
schedulable [LL73]. 

4. The Load Factor Theorem 

Theorem 5: 

“For a periodic task set P, if 

$$\sum_{x=1}^{n} \frac{\mathrm{MET}_x}{\mathrm{PER}_x} > k$$ 

where k is the number of available processors, then the set is not schedulable.” 

Proof: 

A very simple proof is given independently by Zhu [ZLC94] and Jeffay [JSM91] 
for the case where k equals 1. Basically, if both sides of the inequality are multiplied by 
the least common multiple (LCM) of the periods, the inequality is not affected, but 
now 

$$\sum_{x=1}^{n} \mathrm{MET}_x \times \frac{\mathrm{LCM}}{\mathrm{PER}_x} > \mathrm{LCM} \qquad \text{Eq. (2)}$$ 

Clearly, the ratio LCM/PERx defines an integer that represents the number of 
instances for each task X within the LCM. If the number of instances of each task is 
multiplied by its maximum execution time and the results are then added, the result is the 
total computation time needed by the entire task set. According to Eq. 2, however, the 
total computation time needed is bigger than the LCM. In other words, even if all 
instances are executed one after another, they would not be able to finish within LCM. 
The case for k greater than one follows automatically. □ 

It should also be clear from the proof of Theorem 5 that it is valid for both 
preemptive and non-preemptive algorithms [ZLC94]. 
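
The argument of the proof can be replayed numerically: the load factor and the total computation time demanded within one LCM are two views of the same quantity. The following is a minimal sketch with hypothetical function names and assumed task sets, using exact rational arithmetic to avoid rounding.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def load_factor(tasks):
    """Sum of MET/PER over a list of integer (MET, PER, FW) tuples."""
    return sum(Fraction(met, per) for met, per, _ in tasks)


def demand_within_lcm(tasks):
    """Total computation time of all instances within one LCM, and the LCM."""
    hb = reduce(lambda a, b: a * b // gcd(a, b), (per for _, per, _ in tasks))
    return sum(met * (hb // per) for met, per, _ in tasks), hb


ok_set = [(8, 45, 20), (9, 40, 30), (10, 100, 100)]   # assumed sample set
overloaded = [(6, 10, 10), (3, 5, 5)]                 # assumed sample set

print(load_factor(ok_set), demand_within_lcm(ok_set))          # 181/360 (905, 1800)
print(load_factor(overloaded), demand_within_lcm(overloaded))  # 6/5 (12, 10)
```

For the second set the demand of 12 time units cannot fit in an LCM of 10, which is exactly the situation described by Eq. 2 for a single processor.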

5. The Task Demand Theorem 


The following theorem is based upon the previous work of Jeffay, et al. [JSM91], 
which established necessary and sufficient conditions for schedulability of an independent 
periodic task set in a non-preemptable uniprocessor environment. The theorem to be 
introduced next is an adaptation for the scheduling model used in this dissertation. It 
differs from the original theorem in that Jeffay’s model accounts only for tasks that are 
independent and have no explicit deadline other than their own period, and 
his definition of a schedulable set of tasks required that both conditions in the theorem 
be valid for every concrete task set generated from P, where a concrete task set can 
be viewed as the original independent periodic task set P with specific release times for the 
first instance of every operator in P. 

The inclusion of a deadline which differs from the corresponding period 
makes the problem a lot more complex, since tasks can now finish as early as their MET. 
The new results are presented in the following theorems: 

Theorem 6: 

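“For an independent periodic task set P, where the tasks are sorted in non-decreasing 
order by finish-within (i.e., for any pair of tasks X and Y, if X < Y, then FWx ≤ 
FWy), if there exists a feasible schedule for every concrete task set in P, then the following 
conditions hold: 

1) $$\sum_{x=1}^{n} \frac{\mathrm{MET}_x}{\mathrm{PER}_x} \le 1$$ 

2) $$\forall x,\ 1 \le x \le n;\ \forall k,\ 0 \le k < \frac{\mathrm{LCM}}{\mathrm{PER}_x}: \quad \sum_{y=1}^{n} N(y,\ k \times \mathrm{PER}_x + \mathrm{FW}_x) \times \mathrm{MET}_y \le k \times \mathrm{PER}_x + \mathrm{FW}_x$$ 

3) $$\forall x,\ 1 < x \le n;\ \forall L,\ \mathrm{FW}_1 < L < \mathrm{FW}_x: \quad L \ge \mathrm{MET}_x + \sum_{y=1}^{x-1} N(y,\ L-1) \times \mathrm{MET}_y$$ 

where 

$$N(y, L) = \begin{cases} \left\lfloor \dfrac{L}{\mathrm{PER}_y} \right\rfloor & \text{if } L \bmod \mathrm{PER}_y < \mathrm{FW}_y \\[2ex] \left\lfloor \dfrac{L}{\mathrm{PER}_y} \right\rfloor + 1 & \text{if } L \bmod \mathrm{PER}_y \ge \mathrm{FW}_y \end{cases}$$ 

and LCM is the least common multiple of all the periods of the periodic task set.” 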
Proof: 

Condition 1) is basically Theorem 5 for the uniprocessor case. Conditions 2) and 
3) together say that for the set to be schedulable, the processor demand in the interval 
[0, L] (i.e., the sum of computation times from all instances that must finish in the interval 
[0, L]) must always be less than or equal to the length L. As in Jeffay’s work [JSM91], 
the contrapositive of Conditions 2) and 3) will be proven. To prove the contrapositive of 
Condition 2), consider a concrete set of periodic tasks {T1, T2, ..., Tn} where for 1 ≤ x ≤ 
n, the release time of the first instance of task Tx is 0. Then, for every x, 1 ≤ x ≤ n, and 
every k, 0 ≤ k < LCM/PERx, the processor demand from all task instances that 
must finish in the interval [0, k×PERx+FWx] is given by 

$$d_{0,\,k \times \mathrm{PER}_x + \mathrm{FW}_x} = \sum_{y=1}^{n} N(y,\ k \times \mathrm{PER}_x + \mathrm{FW}_x) \times \mathrm{MET}_y$$ 

So if Condition 2) does not hold, then there exist an x and a k such that this demand 
is greater than k×PERx+FWx, and P has an unschedulable concrete set. 

To prove the contrapositive of Condition 3), consider a concrete set of periodic 
tasks {T1, T2, ..., Tn} where for some task Tx, the release time of its first instance is 
0, and for all y ≠ x, the release time of the first instance of task Ty is 1, as shown in 
Figure 3.4. 



Since neither preemption nor inserted idle time is allowed, the first instance of 
task Tx must execute in the interval [0, METx]. For all L, FW1 < L < FWx, in the interval 
[0, L] the processor demand d_{0,L}, from all task instances that must finish by time L, is given 
by 

$$d_{0,L} = \mathrm{MET}_x + \sum_{y=1}^{x-1} N(y,\ L-1) \times \mathrm{MET}_y$$ 

So, if Condition 3) does not hold, then d_{0,L} > L, and P has an unschedulable concrete set. □ 

Note also that the function N(y, L) can also be expressed in closed form as follows: 

$$N(y, L) = \left\lfloor \frac{L}{\mathrm{PER}_y} \right\rfloor + \min\left(1,\ \left\lfloor \frac{1}{\mathrm{FW}_y} \left( L - \left\lfloor \frac{L}{\mathrm{PER}_y} \right\rfloor \times \mathrm{PER}_y \right) \right\rfloor \right)$$ 

The left hand side of the addition operator specifies how many full periods there 
exist for task Y within L, while the right hand side specifies whether the remaining fraction 
of a whole period is large enough for a scheduling interval (i.e., FWy) of task Y. The 
minimum comes into play because if FWy < L/2 < PERy, the second term would otherwise 
contribute more than once to the processor demand in the first period, which cannot occur. 
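
The two formulations of N(y, L) can be compared directly. The sketch below assumes integer time units and positive PER and FW; the function names are hypothetical, and the sample parameters (PER = 45, FW = 20) are chosen only for illustration.

```python
def n_cases(L, per, fw):
    """N(y, L) by cases: instances of a task that must finish within [0, L]."""
    full_periods = L // per
    return full_periods + 1 if L % per >= fw else full_periods


def n_closed(L, per, fw):
    """N(y, L) in closed form, using min() to cap the last partial period."""
    full_periods = L // per
    return full_periods + min(1, (L - full_periods * per) // fw)


# Both forms agree over a range of interval lengths.
print(all(n_cases(L, 45, 20) == n_closed(L, 45, 20) for L in range(500)))  # True
print(n_cases(19, 45, 20), n_cases(20, 45, 20), n_cases(65, 45, 20))       # 0 1 2
```

Without the min(), an interval such as L = 44 would count the first instance twice, since 44 // 20 equals 2.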

As an example consider the task set T1(8,45,20), T2(9,40,30), and T3(10,100,100), 
already sorted by FW. 

Clearly, n = 3 and the interval of interest is 20 < L < 100. 

Let x = 1, then L = 20, which is the trivial case. 

Let x = 2, then 20 < L < 30 
    for 20 < L < 30, L must be ≥ 9 + 8 ✓ 

Let x = 3, then 20 < L < 100 
    for 20 < L < 30, L must be ≥ 10 + 8 ✓ 
    for 30 ≤ L < 65, L must be ≥ 10 + 8 + 9 ✓ 
    for 65 ≤ L < 70, L must be ≥ 10 + 8 + 8 + 9 ✓ 
    for 70 ≤ L < 100, L must be ≥ 10 + 8 + 8 + 9 + 9 ✓ 

If the task set had failed any of the conditions, it could be said that there exists at 
least one concrete task set that could not be scheduled. Alternatively, if all conditions are 
satisfied, then nothing else can be stated before Theorem 7 is introduced. 
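
The three conditions of Theorem 6 can also be checked mechanically. The following sketch evaluates them for the example task set, assuming integer time units and tasks already sorted by non-decreasing FW; the function names are hypothetical, and Condition 3 is evaluated with N(y, L-1) for every integer L strictly between FW1 and FWx.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def n_instances(L, per, fw):
    """N(y, L): instances of a task that must finish within [0, L]."""
    return L // per + (1 if L % per >= fw else 0)


def theorem6_conditions(tasks):
    """Return (cond1, cond2, cond3) for a list of (MET, PER, FW) tuples."""
    hb = reduce(lambda a, b: a * b // gcd(a, b), (per for _, per, _ in tasks))
    cond1 = sum(Fraction(met, per) for met, per, _ in tasks) <= 1
    cond2 = all(
        sum(n_instances(k * per_x + fw_x, per_y, fw_y) * met_y
            for met_y, per_y, fw_y in tasks) <= k * per_x + fw_x
        for _, per_x, fw_x in tasks
        for k in range(hb // per_x)
    )
    fw_1 = tasks[0][2]
    cond3 = all(
        L >= met_x + sum(n_instances(L - 1, per_y, fw_y) * met_y
                         for met_y, per_y, fw_y in tasks[:x])
        for x, (met_x, _, fw_x) in enumerate(tasks)  # tasks[:x] precede task x
        for L in range(fw_1 + 1, fw_x)
    )
    return cond1, cond2, cond3


print(theorem6_conditions([(8, 45, 20), (9, 40, 30), (10, 100, 100)]))
# (True, True, True)
```

Condition 2 is tested only at the deadlines k×PERx+FWx, while Condition 3 sweeps every interval length in its range, which is why its cost grows with the spread of the FW values.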

Theorem 7: 

“If an independent periodic task set P is schedulable according to Theorem 6, then 
the non-preemptive Earliest Deadline First (EDF) algorithm will be able to find a feasible 
schedule for P.” 

Proof: 

As in Jeffay’s work [JSM91] this theorem shall be proved by contradiction. 
Assume that a task in P misses a deadline at some point in time when P is scheduled by the 
EDF algorithm. Let t_d be the earliest point in time at which a deadline is missed. All 
instances of P can be partitioned into three disjoint sets S1, S2 and S3 where: 

S1 is the set of task instances with a deadline at t_d; 

S2 is the set of task instances with an invocation before t_d and deadlines after t_d; 
and 

S3 is the set of task instances not in S1 or S2. 

Let t_0 be the end of the last period prior to t_d in which the processor was idle. If 
the processor has never been idle, then t_0 = 0. Since neither preemption nor inserted idle 
time is allowed, all task instances which are executed in the interval [t_0, t_d] must be 
activated at or after t_0. Depending on whether the interval [t_0, t_d] contains any task from 
the set S2, the following two cases exist: 

Case 1: None of the tasks in S2 are scheduled in the interval [t_0, t_d]. 

This case only happens if t_0 = 0. Otherwise, we either have an instance that misses 
its deadline in the interval [0, t_0] if t_0 − 0 > t_d − t_0, or the processor has an idling period in 
the interval [t_0, t_d] if t_0 − 0 ≤ t_d − t_0. Furthermore, t_d ≤ LCM. Otherwise, we must have 
another instance that misses its deadline prior to t_d. 

Let T_ix be the task instance that misses the deadline at time t_d. Then, t_d − 0 = 
k×PERx+FWx for some k, 0 ≤ k < LCM/PERx. The processor demand from all 
instances which must finish in the interval [0, k×PERx+FWx] equals 

$$\sum_{y=1}^{n} N(y,\ k \times \mathrm{PER}_x + \mathrm{FW}_x) \times \mathrm{MET}_y$$ 

and it is greater than k×PERx+FWx, a contradiction of Condition 2 of Theorem 6. 

Case 2: Some of the task instances of S2 are scheduled to run in the interval [t_0, t_d]. 

Let T_ix be the last instance in S2 scheduled to run prior to t_d in the interval [t_0, t_d] 
and let t_ix be the starting time of T_ix. The invocation time of all task instances scheduled to 
start in the interval [t_ix+1, t_d] must be at or after t_ix+1 and with deadline at or before t_d, 
otherwise the EDF algorithm would not schedule T_ix to start at t_ix. Hence, the processor 
demand for the interval [t_ix, t_d], denoted d_{t_ix,t_d}, must be bounded from above by the 
inequality 

$$d_{t_{ix},t_d} \le \mathrm{MET}_x + \sum_{y=1}^{x-1} N(y,\ t_d - (t_{ix}+1)) \times \mathrm{MET}_y$$ 

Since there is no idle time in [t_ix, t_d], and since a task missed a deadline at t_d, it 
follows that d_{t_ix,t_d} > t_d − t_ix. 

Let L = t_d − t_ix. Then 

$$\mathrm{FW}_1 < L < \mathrm{FW}_x$$ 

and 

$$L < d_{t_{ix},t_d} \le \mathrm{MET}_x + \sum_{y=1}^{x-1} N(y,\ L-1) \times \mathrm{MET}_y$$ 

contradicting Condition 3 of Theorem 6. □ 

Note that Condition 3 in Theorem 6 is a sufficient but not necessary condition for 
schedulability of a particular concrete task set, as illustrated by the following example. 
Consider the task set T1(100,150,150) and T2(100,300,200). Clearly it does not satisfy 
Condition 3, yet a feasible schedule may still be found if their release times are zero. 
However, if the release time of T2 is changed by only one unit of time, then the set is no 
longer schedulable. 

Jeffay, et al. [JSM91], have shown that the problem of determining whether a 
feasible schedule exists for a particular concrete task set is NP-Hard. 
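
The sensitivity of a concrete task set to its release times can be observed with a small simulation of non-preemptive EDF without inserted idle time. This is only a sketch under stated assumptions: the function name is hypothetical, time is discrete, and the one-unit shift is modeled by releasing T1 at time 1 while T2 stays at time 0 (the direction of the shift is an assumption).

```python
def np_edf_first_miss(tasks, releases, horizon):
    """Simulate non-preemptive EDF with no inserted idle time.

    tasks: list of (MET, PER, FW); releases: first release time per task.
    Returns (task_index, instance, finish_time, deadline) for the first
    instance that finishes late among those started before `horizon`,
    or None if no such miss is observed.
    """
    n = len(tasks)
    done = [0] * n  # completed instances per task
    t = 0
    while t < horizon:
        # Release time of the next pending instance of each task.
        rel = [releases[y] + done[y] * tasks[y][1] for y in range(n)]
        ready = [y for y in range(n) if rel[y] <= t]
        if not ready:
            t = min(rel)  # the processor idles only when nothing is ready
            continue
        # Earliest deadline first; ties are broken by task index.
        y = min(ready, key=lambda j: (rel[j] + tasks[j][2], j))
        deadline = rel[y] + tasks[y][2]
        t += tasks[y][0]  # run to completion, no preemption
        if t > deadline:
            return y, done[y] + 1, t, deadline
        done[y] += 1
    return None


tasks = [(100, 150, 150), (100, 300, 200)]       # T1 and T2 from the text
print(np_edf_first_miss(tasks, [0, 0], 600))     # None
print(np_edf_first_miss(tasks, [1, 0], 600))     # (0, 1, 200, 151)
```

In the second run T2 is the only ready task at time zero, so it occupies the processor until time 100 and the first instance of T1 cannot meet its deadline at time 151.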

C. THE HARMONIC BLOCK DILEMMA 

It is a well known and accepted result that the least common multiple (LCM) of 
the periods of a periodic task set provides a finite interval of time, for which a cyclic 
schedule can be calculated, if one exists, and repeated forever [Mok83]. 

Many interpret the above statement to mean that a cyclic feasible schedule must 
exist only in the closed interval [0, LCM], i.e., a feasible schedule for all task instances 
that must start in the interval [0, LCM] and complete execution by time LCM. Such an 
interpretation holds only if the first instance of every task Tx is restricted to complete its 
execution by time PERx. But what if such a restriction is not desirable? It seems very 
reasonable to allow the first instance of a periodic task to start within its period of 
activation but finish up to the end of the period plus its computation time, and actually this 
would be a very desirable property, if it could somehow ease the already difficult 
problem of non-preemptive scheduling. 

Consider the task set T1(190,600,600) and T2(20,200,200) with the precedence 
relation T1 < T2, as illustrated in Figure 3.5. 



Figure 3.5. The Transient and Cyclic Schedules 


Clearly, no feasible schedule exists if the first instance of every task Tx is restricted 
to complete its execution by time PERx. However, if the first instance of 
every task Tx is allowed to start by time PERx and complete its execution by time PERx + METx, 
then a feasible schedule exists. Note also that the cyclic schedule no longer starts at time 
zero, but starts instead at time tc, and furthermore, there can be more than one task 
instance that does not finish by time 2xLCM, as can be illustrated by the task set 
T1(4,100,100), T2(2,5,5), T3(2,100,100) and T4(3,10,10), with precedence relations 
T1 < T2 < T3 < T4. 
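
The first example can be checked numerically. The following is only a minimal sketch, assuming that the three task parameters are read as (MET, PER, deadline) and that the first instances run back to back from time zero in precedence order; the function names are illustrative and not part of any tool described here. It checks the two first-instance rules only, not the feasibility of the whole schedule.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# Each task is (name, MET, PER), listed in precedence order T1 < T2.
chain = [("T1", 190, 600), ("T2", 20, 200)]

def lcm_of_periods(tasks):
    """Least common multiple of all task periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), (per for _, _, per in tasks))

def load_factor(tasks):
    """Sum of MET/PER over all tasks, kept exact with Fraction."""
    return sum(Fraction(met, per) for _, met, per in tasks)

def first_instance_windows(tasks):
    """Start/finish times when the first instances run back to back from time 0."""
    t, windows = 0, {}
    for name, met, _ in tasks:
        windows[name] = (t, t + met)
        t += met
    return windows

def strict_rule_ok(tasks):
    """Restricted rule: every first instance must finish by PERx."""
    w = first_instance_windows(tasks)
    return all(w[name][1] <= per for name, _, per in tasks)

def relaxed_rule_ok(tasks):
    """Relaxed rule: start by PERx and finish by PERx + METx."""
    w = first_instance_windows(tasks)
    return all(w[name][0] <= per and w[name][1] <= per + met
               for name, met, per in tasks)

print(lcm_of_periods(chain), load_factor(chain))
print(strict_rule_ok(chain), relaxed_rule_ok(chain))
```

Under these assumptions T2 cannot start before time 190, so it finishes at 210, which violates the restricted rule (210 > 200) but satisfies the relaxed one (190 ≤ 200 and 210 ≤ 220).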

Here is where a novel approach on how to determine what is a suitable cyclic 
schedule comes into play. The fundamental concept is that a feasible static schedule 
consists of two parts: a transient part, which may be empty, followed by a cyclic part, 
which repeats forever. 

The next theorem, the Harmonic Block Theorem, although different from the one 
introduced by Zhu, et al. [ZLC94], was created after a careful analysis of their work, 
which does not correctly solve the problem. The general direction of the proof will 
consist in showing that if the premises of Theorem 8 are satisfied, then there exists some 
time tc where a part of the schedule can be divided, with exactly the size of one LCM, 
where it is guaranteed that the correct number of task instances are present, and most 
importantly, that they all start and finish within that time interval, characterizing the cyclic 
part of the new schedule. 

Theorem 8: The Harmonic Block Theorem 

“If there exists an infinite feasible schedule S without any inserted idle time for a periodic 
task set P with precedence constraints, such that the first instance of every task Tx in P 
must start by time PERx, then there exists an infinite feasible schedule S’ consisting of a 
transient portion of length at most LCM, followed by a cyclic portion of length LCM that 
repeats forever.” 

Proof: 

If there is no idling time period in the intervals [0,LCM] or [LCM,2xLCM], then 
the given set of periodic tasks P must have a load factor of 1, and the first instance of 
every task Tx must finish its execution at or before time PERx in any feasible schedule. 
Hence, the segment of S in the interval [0,LCM] forms the cyclic portion of an infinite 
feasible schedule satisfying the Theorem. 

Suppose now that idling time exists in the intervals [0,LCM] and [LCM,2xLCM]. 
Let tc be the end of the last period prior to time LCM in which the processor was idling in 
S, and let ti be the end of the last period prior to time tc+LCM in which the processor was 
also idling in S, as shown in Figure 3.6. 

Figure 3.6. Determining the Start Time tc of the Cyclic Schedule 


Assertion (1) 

Since no unnecessary idle time is inserted in our schedule S, it should be clear that 
there cannot be any first instances of tasks being activated after time tc, because otherwise 
they could have started execution before time tc. 

Assertion (2) 

Another important point to be made is that all tasks which start after time ti could 
not be activated before time ti, for the same reason: no idle time is inserted in our 
schedule S. 

Assertion (3) 

Every task instance that is activated in the interval [ti,tc+LCM) must finish its 
execution at or before tc+LCM. Suppose this claim is not true. Then there must exist 
some instances which are activated before tc+LCM and cannot finish at or before tc+LCM. 
Denote the collection of all instances which are activated in the interval [ti,tc+LCM) by 
τ. It follows from assertion (2) that every instance in τ must be activated in the interval 
[ti,tc+LCM). This implies that 

$\sum_{T_{ix} \in \tau} MET_{ix} > t_c + LCM - t_i$ (i) 

Let τ' denote the set of task instances that are activated in the interval [ti-LCM,tc). 
It follows from assertion (1) that every task instance in τ must have a corresponding 
instance in τ', activated one LCM earlier, so that 

$\sum_{T_{ix} \in \tau} MET_{ix} \le \sum_{T_{iy} \in \tau'} MET_{iy}$ (ii) 

Note that all instances in τ' must finish within the interval [ti-LCM,tc], because tc is 
the end of an idling period. Hence, 

$\sum_{T_{iy} \in \tau'} MET_{iy} < t_c - (t_i - LCM) = t_c + LCM - t_i$ (iii) 

From inequalities (i), (ii), and (iii), 

$t_c + LCM - t_i < \sum_{T_{ix} \in \tau} MET_{ix} < t_c + LCM - t_i$, 

which is a contradiction. 

Assertion (4) 

All instances after tc are at least second instances and hence, for every task Tx, 
there must exist LCM/PERx activations within the interval [tc,tc+LCM). By assertion (3) 
they all finish within this same interval. The segment of S in the interval [tc,tc+LCM) 
therefore contains the correct number of instances. 

Concluding the proof, it can be said that the intervals [0,tc] and [tc,tc+LCM] of S 
form respectively the transient portion and the cyclic portion of the new schedule S’, 
satisfying the consequence of the Theorem. □ 
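
The choice of tc in the proof can be sketched in a few lines. This is only an illustration, assuming the schedule S is given as a sorted list of non-overlapping busy intervals (start, end) that begins at time zero; `cyclic_start` is a hypothetical helper name, and returning 0 when no idle period is found corresponds to the first case of the proof.

```python
def cyclic_start(busy, lcm):
    """Return tc, the end of the last idle period that ends at or before time lcm.

    busy: sorted, non-overlapping (start, end) busy intervals of schedule S.
    Returns 0 when the processor never idles before time lcm.
    """
    tc = 0
    prev_end = 0
    for start, end in busy:
        # An idle period [prev_end, start) ends where the next busy interval begins.
        if start > prev_end and start <= lcm:
            tc = start
        prev_end = max(prev_end, end)
    return tc

# Toy schedule with LCM = 12: the only idle period is [8, 10).
busy = [(0, 5), (5, 8), (10, 14), (14, 20)]
tc = cyclic_start(busy, 12)
print("transient:", (0, tc), "cyclic:", (tc, tc + 12))
```

In this toy schedule the transient portion is [0,10] and the cyclic portion is [10,22].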

As can be seen, by a proper choice of the start time of the cyclic portion of the 
schedule, one can increase the schedulability of task sets which were previously assumed 
to have no feasible schedule, when the cyclic schedule was restricted to always start at 
time zero. Note also that the same approach is valid for preemptive task sets. 

D. A NOTE ABOUT PRECEDENCE CONSTRAINTS 

Every reference to precedence constraints between tasks is usually 
attached to the meaning of synchronization; in other words, if two tasks have some kind of 
precedence relation, then they must be synchronized. Furthermore, if their periods are 
different, then they should be synchronized at intervals corresponding to the least common 
multiple of their periods. But then, what is the real need for synchronization if there are 
cases where some data may well be lost? Does it exist only to enforce a fixed pattern on 
how data are lost, e.g., instances three from task X and two from task Y, six and four, and 
so forth will synchronize? These and other questions will be discussed further in 
Chapter IV. 

We shall argue in Chapter IV that the major reason for synchronization is to 
guarantee timely processing of triggering data. We shall show that, by relaxing the upper 
bound on the delay in processing each instance of triggering data, we can guarantee that, 
even without explicit synchronization, each instance of the trigger data will be processed 
within an interval equal to two times the period of the consumer operator. The removal of 
the need for synchronization is particularly important in distributed systems, where 
synchronization mechanisms are very costly if not impossible. It is also desirable not to 
have synchronization in uni-processor systems, because now we can treat each 
topological ordering of the tasks satisfying the precedence relationships as a concrete set 
of periodic tasks, where the starting time of task Tx is greater than or equal to the sum of 
the METy of all tasks Ty that are ancestors of Tx in the task graph. 

Note that if non-zero latency is present in the edges of the precedence graph, then 
we must further delay the starting time of the first instance of every task Ty, so that S1y ≥ 
max{S1x+METx+LATxy, ∀ parent operator Tx of Ty}, where LATxy denotes the latency 
associated with the edge (Tx, Ty). 
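
The lower bound on the first start times can be computed by one pass over the precedence graph. The sketch below is a minimal illustration, assuming the graph is acyclic and is given as a dictionary of edges with their latencies; the function name and data layout are hypothetical. It only evaluates the bound imposed by parents and latencies; it does not account for the processor being busy with other tasks.

```python
def earliest_first_starts(met, edges):
    """Lower bound S1y = max over parents Tx of (S1x + METx + LATxy).

    met:   dict task -> maximum execution time
    edges: dict (parent, child) -> latency on that edge (acyclic graph assumed)
    """
    parents = {task: [] for task in met}
    for (parent, child), latency in edges.items():
        parents[child].append((parent, latency))

    start = {}

    def bound(task):
        # Memoized recursion over the parents; tasks without parents start at 0.
        if task not in start:
            start[task] = max(
                (bound(p) + met[p] + lat for p, lat in parents[task]),
                default=0,
            )
        return start[task]

    for task in met:
        bound(task)
    return start

met = {"A": 3, "B": 2, "C": 4}
edges = {("A", "B"): 1, ("A", "C"): 0, ("B", "C"): 2}
print(earliest_first_starts(met, edges))
```

For this toy graph, B cannot start before 0+3+1 = 4, and C is bounded by its parent B, giving 4+2+2 = 8.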

In order for the arguments in the proof of Theorem 8 to hold, we need to choose tc 
to be the end of the first idling period after time LCM, resulting in a Modified Harmonic 
Block Theorem that reads: 

Theorem 9: 

“If there exists an infinite feasible schedule S for a periodic task set P with precedence 
constraints, such that the first instance of every task Ty in P must start by time PERy, then 
there exists an infinite feasible schedule S’ consisting of a transient portion of length at 
most 2xLCM, followed by a cyclic portion of length LCM that repeats forever.” 

Proof: 

The main difference when dealing with latencies is that idling periods may exist 
before the starting time of the first instance of some task Tx in the schedule. Theorem 8 
still holds for this case, because the presence of idling time only affects the release time of 
the tasks, as long as PERy ≥ S1y ≥ max{S1x+METx+LATxy}. However, unlike in Theorem 8 of 
Section C, the cyclic portion of the schedule may now start after time LCM. The reason is 
that the schedule S may contain first instances in the interval [tc,tc+LCM], the exclusion of 
which was the key to our previous proof of Theorem 8. After these considerations, the same 
proof used for Theorem 8 can be applied to this case. □ 

E. COPING WITH APERIODIC TASKS 

Generally speaking, a sporadic task is defined as an aperiodic task that has a 
minimum duration between two consecutive activations. If that were not so, neither the 
static nor the dynamic approach could be used to guarantee schedulability. 

If interrupts are used to detect the occurrence of aperiodic events at run-time, then 
a dynamic approach should be used. However, in the static scheduling framework, where 
all the task requests must be known a priori so that a fixed and static schedule can be 
generated, the only way to handle sporadic tasks, whose arrival times are not known 
exactly, is by using a periodic process to function as a polling device. Its 
main role is to check for requests of sporadic tasks and to serve them during its allocated 
time slot. However, due to the random nature of aperiodic processes, we may not be able 
to handle a concentrated set of arrivals or, even worse, may not catch them at all with the 
sporadic server approach. To overcome this difficulty, several bandwidth preserving 
algorithms have been proposed, among them the Priority Exchange, the 
Deferrable Server and the Sporadic Server [AB93]. 

The CAPS approach was to use one sporadic server for each time-critical sporadic 
operator. This approach, although very restrictive, is the only way to guarantee that all 
time-critical sporadic tasks would be serviced in a timely fashion under the worst case 
situation. 

Therefore, the next step is to convert the sporadic operator into a periodic one so 
that all the original timing constraints from the sporadic operator are still satisfied. 

1. The Conversion 


The term triggering period (TP) will be used for the period of the converted 
sporadic operator and the usual term FW for its finish-within. As shown in Figures 3.7 
and 3.8, basically two cases can occur. 

The first is when MCP < MRT - MET and the equivalent periodic operator must 
have TP ≤ MCP in order to satisfy the original timing constraints. We must also enforce that 
FW = MRT - MCP, so that in the critical case shown in Figure 3.7, the data that was 
missed by the previous triggering period can be consumed by the next TP and still finish 
within the original MRT. 



The second case, shown in Figure 3.8, occurs when MRT - MET ≤ MCP. This 
more constrained situation forces a further reduction in the triggering period. Thus, the 
new TP should be TP ≤ MRT - MET and the FW should be equal to MET. 





Figure 3.8. The Sporadic Conversion when MCP ≥ MRT-MET 

In general, the triggering period should be 

MET ≤ TP ≤ min(MRT - MET, MCP). 

Nevertheless, in order to minimize the impact on the load factor of the prototype, 
it is desirable that TP be as large as possible, meaning that 

TP = min(MRT - MET, MCP). 

Now, assuming that the values for TP and FW have been established, so that the 
original timing constraints of the sporadic operator are satisfied, let's see what kind of 
relations should exist between the original values, so that we could validate them. 

Clearly: 

• MET ≤ MRT (by Theorem 2) 

• MET ≤ MCP (by Theorem 1)     Eq. (1) 

• MET ≤ TP (by Theorem 1) 

• TP ≤ MCP (for static scheduling)* 

• MET ≤ FW ≤ TP (Scheduling Model)     Eq. (2) 

* Otherwise we would have to be able to detect at run-time when new data had arrived, which is only possible 
with dynamic scheduling. 

For case A: MCP < MRT - MET 

TP = MCP     Eq. (3) 

and 

FW = MRT - MCP     Eq. (4) 

Plugging (3) and (4) into (2), 

MET ≤ MRT - MCP ≤ MCP     Eq. (5) 

From the right inequality of (5), 

MRT ≤ 2xMCP 

Plugging (1) into the left inequality of (5), 

MRT ≥ 2xMET 

For case B: MRT - MET ≤ MCP 

TP = MRT - MET     Eq. (6) 

and 

FW = MET     Eq. (7) 

Plugging (6) and (7) into (2), 

MET ≤ MET ≤ MRT - MET     Eq. (8) 

From the right inequality of (8), 

MRT ≥ 2xMET 

Also, 

MRT - MET ≤ MCP or MRT - MCP ≤ MET 

Plugging (1) into the above inequality, 

MRT - MCP ≤ MCP or MRT ≤ 2xMCP 

Therefore the MRT for a sporadic operator must be upper bounded by twice its 
MCP and lower bounded by twice its MET, as follows: 

2xMET ≤ MRT ≤ 2xMCP 

Note that when MRT assumes its lowest possible value, which is 2 x MET, the 
triggering period TP will also reflect its lowest possible value, which is MET, with FW 
still being equal to MET. This case is illustrated in Figure 3.9. 
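
The two conversion cases and the validity bounds on MRT can be summarized in a short function. This is a minimal sketch under the relations derived above, choosing the largest admissible TP in each case; the function name and the use of an exception for invalid inputs are illustrative assumptions, not part of CAPS.

```python
def convert_sporadic(met, mcp, mrt):
    """Return (TP, FW) for the periodic operator equivalent to a sporadic one.

    met: maximum execution time, mcp: minimum calling period,
    mrt: maximum response time. Uses the largest admissible TP.
    """
    # Validity bounds derived above: 2xMET <= MRT <= 2xMCP.
    if not (2 * met <= mrt <= 2 * mcp):
        raise ValueError("MRT must satisfy 2*MET <= MRT <= 2*MCP")
    if mcp < mrt - met:
        # Case A: TP = MCP, FW = MRT - MCP.
        return mcp, mrt - mcp
    # Case B: TP = MRT - MET, FW = MET.
    return mrt - met, met

print(convert_sporadic(10, 50, 80))  # case A
print(convert_sporadic(10, 50, 40))  # case B
print(convert_sporadic(10, 50, 20))  # lowest MRT: TP = FW = MET
```

In every valid case the result satisfies MET ≤ FW ≤ TP ≤ MCP, which is the scheduling model constraint of Eq. (2).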



Note that in both cases the conversion of a sporadic operator results in very 
stringent timing constraints on the equivalent periodic operator. This will definitely have a 
great impact on the schedulability of the prototype. In the second case, for example, there 
is no slack time for the converted operator, since FW = MET. This forces us to remove 
portions of length MET from the schedule, in which no other operator can be scheduled. 

Of course, the amount of slack time for this operator can be increased by 
decreasing its TP, but this will also increase the entire load factor. Basically, there exists a 
trade-off between load factor and slack time. How much to increase one to the detriment of 
the other in order to increase schedulability is a very difficult question. 
While this question does not have a definitive answer, the following analysis offers 
suggestions to help designers in finding solutions that best fit their needs. 

When converting a sporadic operator into an equivalent periodic one, the 
triggering period (TP) can range from a minimum of MRT/2, where the slack time is equal 
to MRT/2 - MET, up to a maximum value equal to min(MRT-MET, MCP), implying that 
the slack time is max((MRT-MET-TP), 0). 

First, define the load factor contribution as 

$LFC = \frac{MET}{TP} - \frac{MET}{TP_{max}}$, 

i.e., the difference between the corresponding LF for a specific triggering period TP and 
the load factor if TP were set to its maximum value. Within the interval MRT/2 ≤ TP ≤ 
min(MRT-MET, MCP), the slack time ST, which is the scheduling interval for the sporadic 
task minus its computation time, is defined as ST = MRT - MET - TP, as can be derived 
from Figures 3.7 and 3.8. 
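
The trade-off between the two quantities can be tabulated directly from these definitions. The sketch below is illustrative only: the function name and the sample values of MET, MCP and MRT are assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def lfc_and_slack(met, mcp, mrt, tp):
    """Return (LFC, ST) for a triggering period tp in [MRT/2, min(MRT-MET, MCP)]."""
    tp_max = min(mrt - met, mcp)
    if not (Fraction(mrt, 2) <= tp <= tp_max):
        raise ValueError("TP outside the valid range")
    lfc = Fraction(met, tp) - Fraction(met, tp_max)  # LFC = MET/TP - MET/TPmax
    st = mrt - met - tp                              # ST = MRT - MET - TP
    return lfc, st

# Hypothetical operator: MET = 10, MCP = 50, MRT = 80, so TP ranges over [40, 50].
for tp in (40, 45, 50):
    lfc, st = lfc_and_slack(10, 50, 80, tp)
    print(tp, lfc, st)
```

For these sample values, lowering TP from 50 to 40 raises the slack time from 20 to 30 while adding 1/20 to the load factor.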

Clearly, when TP is maximum, the load factor contribution (LFC) is zero, in the 
sense that the triggering period cannot be increased any further. For the other values of TP, 
including those enforced in the conversions for the previous cases A and B, some 
considerations must be taken into account. Assume that MCP ≥ MRT-MET. Although it may 
appear at first that LFC varies with MRT, since TP is lower bounded by MRT/2, that is not 
the case; in other words, MRT only limits the valid range for TP. Figure 3.10 shows a family 
of curves for different values of MCP, and for a fixed value of MET and MRT. As explained 
earlier, LFC is insensitive to changes in MRT. 

The load factor contribution LFC, as previously defined, is a function inversely 
proportional to the triggering period TP, and it decreases faster for periods less 
than TP_0 = √MET, where its first derivative with respect to TP is equal to -1*. Note, 
however, that TP cannot be smaller than MET, meaning that TP_0 will always be located 
to the left of any valid value for TP. The main conclusion is that different values of MCP 
have a very small effect on the variation of LFC. A similar conclusion can also be drawn for 
the case where MCP < MRT-MET. Therefore, in any case, the consequence is that we 
always have the full range of TP, from MRT/2 up to min(MRT-MET, MCP), in which to change 
TP without causing any significant harm to the load factor of the system. 

* Care must be taken with the fact that the derivative at some point being equal to -1 does not imply 
that the slope equals 135° at that point, since both axes may have different scales, as shown in Figure 3.10. 

Figure 3.10. Effects of TP on the Load Factor 

Note that the very first question remains unanswered, but now, the effects in the 
total load factor are more clearly understood when the triggering period is changed. 

2. Important Remarks about the Conversion 

This first idea of conversion of sporadic operators was introduced by Mok 
[Mok83] in his Lemma 2.3, which states: 

“Let M = Mp ∪ Ms be an instance of a process model. Suppose we 
replace every sporadic process Ti = (ci,pi,di) ∈ Ms by a periodic process T’i 
= (c’i,p’i,d’i) with c’i = ci, p’i = min(di-ci+1, pi) and d’i = ci. If the resulting 
set of all periodic processes M’ can be successfully scheduled, then the 
original set of processes M can be scheduled without a priori knowledge of 
the request times of the sporadic processes in Ms.” 

Note, however, that although the idea of the transformation is valid, care must be 
taken to see the context in which that sporadic operator appears, since some of its 
attributes, such as minimum calling period, are totally dependent upon the producer of the 
triggering data and not on the sporadic operator itself. In other words, if the producer of 
data for some sporadic task is an external event that will be handled by some kind of 
interrupt handler, then there will be no influence whatsoever in the generation of the data, 
and the minimum period will be obeyed by the external device. However, if the producer 
is another task that will be included in our static schedule, it must be assured that two 
consecutive instances of the producer operator will not be scheduled closer than the 
minimum period specified for the sporadic consumer. In this case, the transformation 
alone is not enough, and an additional restriction must be imposed on the producer of the 
data. This situation is depicted in Figure 3.11. 

In conclusion, it can be said that Mok’s lemma by itself does not guarantee that a 
schedule really exists for the original set, even if the resulting set of all periodic processes 
M’ can be successfully scheduled, unless as explained earlier, a restriction is imposed on 
the producers as well. 
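
Both ingredients, the transformation of the lemma and the additional restriction on a scheduled producer, can be sketched as follows. This is an illustration only: the function names are hypothetical, the producer check assumes the start times of the producer instances within one cyclic schedule are known, and the wrap-around between the last instance of one cycle and the first instance of the next is included through the `cycle` argument.

```python
def mok_convert(c, p, d):
    """Periodic replacement (c', p', d') of a sporadic process (c, p, d) per the lemma."""
    return c, min(d - c + 1, p), c

def producer_respects_mcp(starts, cycle, mcp):
    """Check that consecutive producer instances are at least mcp apart.

    starts: sorted start times of the producer instances within one cycle
    cycle:  length of the cyclic schedule, used for the wrap-around gap
    """
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    gaps.append(starts[0] + cycle - starts[-1])  # last instance to first of next cycle
    return all(gap >= mcp for gap in gaps)

print(mok_convert(2, 10, 6))
print(producer_respects_mcp([0, 30, 70], 100, 30))
print(producer_respects_mcp([0, 20, 70], 100, 30))
```

The second producer schedule fails the check because two of its instances start only 20 time units apart, which is closer than the assumed minimum calling period of 30.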



Figure 3.11. Restrictions on the Producer Imposed by the Consumer’s MCP 

3. Implementation Issues about the Conversion 

When implementing this conversion it is strongly recommended that a careful 
analysis of the task graph be made to determine reasonable bounds for the period of the 
transformed sporadic operator. At first glance, an obvious upper-bound is the value of its 
MCP. However, for lower-bounds this choice is not so clear. Nonetheless, it is assumed 
that after this pre-processing there will be an interval of possible values for the period of 
the transformed sporadic task. The reason for these bounds is to provide us with some 
margin for making the conversion, so that the final harmonic block of the entire set is not 
increased significantly. 

Given a set of sporadic operators, the following steps are suggested for the final 
choice of their periods: 

1) Set the period of every sporadic task to its upper-bound, so that the total load 
factor is minimized; 

2) Try to find a feasible schedule for the entire prototype (if this is not possible, 
pick one sporadic task); 

3) Start decreasing its period; 

4) For each new period, check for schedulability; 

5) Proceed until its lower-bound is reached. If no schedule is found, reset its period 
to the upper-bound, pick another task and go back to step 3. 
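
The steps above can be sketched as a simple search loop. This is a minimal illustration, not the CAPS implementation: `is_schedulable` stands for whatever schedulability test is available and is a hypothetical callback, the periods are assumed to be integers, and the fixed decrement `step` is an assumption of this sketch.

```python
def choose_periods(bounds, is_schedulable, step=1):
    """Search for periods of the converted sporadic tasks.

    bounds:         dict task -> (lower_bound, upper_bound) of its period
    is_schedulable: callback taking a dict task -> period, returning True/False
    Returns a dict of periods, or None when the heuristic finds no schedule.
    """
    # Step 1: start with every period at its upper-bound (minimum load factor).
    periods = {task: hi for task, (lo, hi) in bounds.items()}
    # Step 2: try to schedule the whole prototype as it is.
    if is_schedulable(periods):
        return periods
    for task, (lo, hi) in bounds.items():
        # Steps 3 and 4: decrease the period of one task, testing each value.
        period = hi - step
        while period >= lo:
            periods[task] = period
            if is_schedulable(periods):
                return periods
            period -= step
        # Step 5: lower-bound reached, reset and move on to another task.
        periods[task] = hi
    return None

# Toy test: the set is schedulable only when S2 has a period of at most 25.
bounds = {"S1": (40, 50), "S2": (20, 30)}
print(choose_periods(bounds, lambda p: p["S2"] <= 25))
```

Note that, as in the steps themselves, only one period at a time is moved away from its upper-bound, so combinations that require decreasing two periods simultaneously are not explored.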

Another possible heuristic is to assign the smallest period among the periodic 
operators which is closest to but smaller than the upper-bound of the sporadic operator, 
and then proceed with the schedulability tests. One could also try to minimize the 
harmonic block. As can be seen, there are several possible heuristics, but there is no 
optimal solution. Nevertheless, it is understood that, due to the very stringent timing 
constraints resulting from the conversion, every possible attention should be given to this 
step. 

IV. DISTRIBUTED SCHEDULING 


A. INTRODUCTION 

For uniprocessor systems, most scheduling problems involving precedence 
constraints can be solved in polynomial time. Lawler [Law73] showed that scheduling 
non-preemptable tasks with unit computation times, deadlines, and arbitrary precedence 
constraints can be accomplished using the Latest Deadline First Algorithm in O(n²) time. 
Similar results were obtained by Lageweg, Lenstra, and Kan, even for tasks with an 
arbitrary computation time, if the release times were assumed to be zero for all tasks. 
Blazewicz [Bla76] proved that, for this scheduling problem, a preemptive schedule exists 
if and only if a non-preemptive schedule exists. Therefore, in this case, preemption need 
not be considered. Blazewicz also demonstrated that the Earliest Deadline First algorithm 
can also be used to schedule preemptable tasks. The only scheduling problem involving 
precedence relations that has been proven to be NP-complete is the non-preemptable case, 
where no restrictions are placed on the release times nor on the computation times. The 
non-preemptable case is also NP-complete if there are no precedence relations among the 
tasks [GJ77a]. 
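
For illustration, the Latest Deadline First rule for unit computation times can be sketched as follows. This is a textbook-style sketch written for this discussion, assuming an acyclic precedence graph and release times of zero; the function name, the data layout and the alphabetical tie-breaking are assumptions. The schedule is built from the back: among the tasks whose successors have all been placed, the one with the latest deadline is put last.

```python
def latest_deadline_first(deadline, successors):
    """Order unit-time tasks by the Latest Deadline First rule.

    deadline:   dict task -> deadline
    successors: dict task -> set of tasks that must run after it (acyclic)
    """
    remaining = set(deadline)
    reversed_order = []
    while remaining:
        # Candidates: tasks with no successor still waiting to be placed.
        ready = [t for t in remaining if not (successors.get(t, set()) & remaining)]
        pick = max(ready, key=lambda t: (deadline[t], t))  # latest deadline goes last
        reversed_order.append(pick)
        remaining.remove(pick)
    return reversed_order[::-1]

deadline = {"A": 2, "B": 5, "C": 4, "D": 3}
successors = {"A": {"C"}, "B": {"D"}}
order = latest_deadline_first(deadline, successors)
# With unit computation times, the k-th task in the order finishes at time k.
feasible = all(k <= deadline[t] for k, t in enumerate(order, start=1))
print(order, feasible)
```

Each of the n iterations scans at most n remaining tasks, which is consistent with the quadratic bound quoted above when the successor check is done in constant time per task.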

Scheduling tasks with precedence constraints in multiprocessor systems is much 
more difficult than doing so in uniprocessor systems. For example, scheduling tasks with 
arbitrary precedence constraints and unit computation time is NP-hard both for the 
preemptive and the non-preemptive cases [Ull75, Ull76]. 

Many researchers have attempted to develop efficient heuristic algorithms to 
solve the general problem, but with limited success. In most cases, the researcher ended 
up restricting the solution space for specific cases, such as when the task graph is a forest, 
or when there are no precedence constraints. 

In general, two different approaches to handling distributed computation can be 
identified. In the first, the distributed system is coordinated by a single system clock, 
which synchronizes all tasks so that computation progresses in a lock-step fashion, and 
communication between tasks can only occur at specific times. In the second approach, 
tasks are synchronized only when necessary, and do so by executing appropriate 
handshake protocols. The former approach requires less inter-processor communication, but is 
rigid, and relies on a global clock whose implementation is by itself another very difficult 
problem to solve. The latter approach, although more flexible, dramatically increases the 
complexity of the synchronization problem, and may be very costly in terms of 
communication, since many acknowledge signals must be exchanged in order to maintain 
proper synchronization. The use of rigorous and more constrained timing requirements 
allows for the establishment of a weak form of synchronization among the tasks of the 
distributed system, and represents an alternative in the middle [Mok83]. 

B. ARCHITECTURAL ISSUES 

This section is not intended to present an in-depth analysis of the effects of the 
architecture on distributed scheduling, but merely to introduce some of the problems so 
that the reader may be aware of their existence and importance. 

In a distributed environment, it is very likely that one will have to deal with 
heterogeneous computers, each one with a different clock, different memory systems, and 
so forth. It is therefore important to realize how these attributes can affect scheduling. 

1. Different Clocks 

The precision of a clock is directly related to its granularity, the minimum number 
of ticks it can handle, and the quality of its time reference, which is usually based on some 
kind of crystal. The first limiting factor imposed by the clock, therefore, is the minimum 
acceptable period. This is not, however, an actual limitation, since typical clocks range 
from tens to hundreds of megahertz, providing an order of nanoseconds for the minimum 
allowable period. The real problem is that clocks can drift among themselves, causing a 
variety of synchronization problems. Maintaining an accurate global clock is one of the 
most challenging tasks in the distributed processing arena. Usually this is achieved at the 
cost of substantial overhead in communications. 




2. Speed of CPUs 

The net result when different processors are present is a different execution time 
for the same piece of code when running on the various processors. This factor 
necessitates previous knowledge of allocation by the scheduler, so that it can be taken into 
account. Within CAPS, this is accomplished automatically, because a kind of simulated 
time is used for scheduling, which is scaled according to the speed of the machine on 
which it runs. 

3. Memory 

Issues like cache size, paging, number of pipelining stages, etc., can affect the 
overall throughput of the system, and consequently the timing requirements, but hopefully 
all of these different delays are already taken into account by the specified maximum 
execution time of the task. 

4. The Communication Media 

This is one of the most important factors in dealing with distributed systems, and 
can greatly affect final timing requirements for the application. Note also that the timing 
requirements are affected not only by the actual transmission delay, but also by the 
operating system functions invoked on behalf of the applications. In CAPS, for example, 
although there is a time-bounded protocol (FDDI), it is still necessary to make calls to the 
underlying Unix operating system, which has no support for real-time applications. 

5. Interconnectivity 

The number of processors, the distance by which they are separated, their abilities 
to communicate with one another, etc., are issues that should be raised before tackling the 
scheduling problem. 

C. THE PROBLEM STATEMENT 

To reiterate, the original objective of this research was to find better methods of 
supporting efficient and reliable scheduling of distributed hard real-time systems. 
It is unquestionable that the ideal real-time distributed system should be able to 
support groups of tasks running asynchronously in different processors, each processor 
having its own internal clock. An additional goal, despite the precedence relations among 
the tasks, would be to eliminate the need for enforcement of any kind of synchronization 
required for communication. An even more important goal would be that all the deadlines 
and other requirements (such as no loss of data, etc.) could be met. 

Being aware of the complexity of the message routing problem described in 
Chapter I and reviewing the alternatives presented in Section A, it appears that the 
best available option to achieve the ideal system is the very last alternative, i.e., to sacrifice 
timing constraints in order to decrease scheduling complexity. Unfortunately, that is not 
the current trend in most research in the field of distributed scheduling today. 
Researchers are still trying to find better heuristics for scheduling algorithms so that the 
timing complexity for a sub-optimal case is decreased by some constant factor. But, due 
to the NP-Hard nature of the problem, it is most likely that some restrictions will be 
imposed on the initial problem. 

This work moves in the other direction, in other words, investigating ways of 
restricting or relaxing the timing requirements so as to increase the chances of finding a 
feasible schedule. It is understood, however, that, depending on the application, this 
approach may not be practicable. It may well be that most of the timing requirements 
cannot be changed at all. However, this is most likely untrue for most cases. Especially in 
this applications framework, where the user is prototyping the intended system in the early 
stages of its life cycle, there is an opportunity to validate and change the system’s 
requirements, which makes this approach very attractive. Note, however, that this 
discussion is not about missing deadlines or employing imprecise computations [LLS91], 
but focuses simply on relaxing timing constraints so that no synchronization is needed, and 
consequently decreasing substantially the complexity of the distributed scheduling 
problem. 

The next section addresses the underlying semantics behind all possible 
combinations of triggering conditions, stream types and operator types within a valid 
PSDL program, so that later, when discussing the major synchronization issues, it is 
certain that all cases have been covered. 

D. SYNCHRONIZATION IN PSDL 

There are two kinds of streams in PSDL, Sampled Streams (SS) and Data Flow 
Streams (DF). Note, however, that within the former are two semantically different 
sub-types of streams, depending on the triggering condition of the consumer operator. If the 
consumer operator is not triggered (NT) by any data, then it should be understood that a 
specific data value can be lost or overwritten, or even read over and over again by the 
consumer, without any harm to the system. This type of behavior is very useful when 
reading sensor data. In most cases, the sensors will be able to generate data at a much 
higher rate than the consumer will read it, but the most recent data is of primary interest. 
Even for tracking systems, where the history of data values is very important, this kind of 
stream is still very useful. Note in Figure 4.1 that a specific value at some previous time t 
is not relevant, because the consumer is only interested in the average behavior, so that the 
filter algorithm can predict the future position of the target. In this kind of situation, no 
synchronization is needed, releasing the producer and consumer operators from any 
constraints on their periods. 

The second type of Sampled Stream exists when the consumer operator is 
TRIGGERED BY SOME (TBS) data value. By definition, the consumer with this 
triggering condition should always catch a new piece of data if it is from one of the 
streams specified in the TRIGGERED BY SOME clause. For example, if some operator 
OP1 is TRIGGERED BY SOME X, Y, then, if new data is coming from either X or Y, it 
should be guaranteed to be read, and not lost or overwritten. 

Although buffer overflow or underflow is not an issue, due to the way sampled 
streams are defined, the only way to avoid loss of data in this case is to enforce the 
condition that PERproducer ≥ PERconsumer and, consequently, the synchronization problem 
will have to be handled accordingly. 

Finally, in the case of Data Flow Streams, where the consumer is TRIGGERED 
BY ALL, the inputs specified in the TRIGGERED BY ALL clause should be 
examined for new data, and if all of them happen to have new data in their buffer, they should be 
consumed, firing the operator. The TRIGGERED BY ALL condition can be thought of 
as being a logical AND among the streams declared in the TRIGGERED BY ALL clause. 
Clearly, in this case, there is also a need to enforce PERproducer ≥ PERconsumer so that no data 
is lost, and once again the synchronization problem must be handled explicitly. 

The basic semantic difference between the TRIGGERED BY ALL data flow 
streams and the TRIGGERED BY SOME sampled streams is that if for any reason the 
data is not consumed and another piece of new data arrives, in the former it will raise a 
buffer overflow exception, while in the latter the data will be simply overwritten. 
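
The difference between the two behaviors can be illustrated with two toy classes. This is only a sketch of the semantics described above, with a one-element buffer per stream; the class names, method names and exception types are hypothetical and do not correspond to the actual CAPS run-time support.

```python
class BufferOverflow(Exception):
    """Raised when a data flow stream is written while it still holds unread data."""

class BufferUnderflow(Exception):
    """Raised when a data flow stream is read while it is empty."""

class SampledStream:
    """Unread data is silently overwritten by the next write."""
    def __init__(self):
        self.value = None
        self.fresh = False      # True when the value has not been read yet
    def write(self, value):
        self.value = value      # overwrite, even if the old value was never read
        self.fresh = True
    def read(self):
        self.fresh = False
        return self.value       # the same value may be read again later

class DataFlowStream:
    """FIFO of size one: writing over unread data is an error."""
    def __init__(self):
        self.value = None
        self.full = False
    def write(self, value):
        if self.full:
            raise BufferOverflow("previous value was not consumed")
        self.value, self.full = value, True
    def read(self):
        if not self.full:
            raise BufferUnderflow("no new data to consume")
        self.full = False
        return self.value

s = SampledStream()
s.write(1)
s.write(2)          # the value 1 is lost without any error
print(s.read())

d = DataFlowStream()
d.write(1)
try:
    d.write(2)      # the value 1 was not consumed yet
except BufferOverflow as exc:
    print("overflow:", exc)
```

With the sampled stream, a TRIGGERED BY SOME consumer would use the `fresh` flag to decide whether it must fire; with the data flow stream, the producer itself is stopped by the exception.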

E. DEALING WITH SPECIAL CASES 

Data flow streams are currently implemented in CAPS as a FIFO queue of buffer 
size one. This imposes an important restriction on the PSDL program, that is, all 
producers of data flow streams to some unique consumer should have the same period, or 
a FIFO buffer overflow may occur in one of the streams, even if the condition 
PERproducer ≥ PERconsumer is satisfied (Figure 4.2). This happens because OP1 may 
write twice before OP2 outputs some value so that the triggering condition can be 
satisfied. This problem usually reflects a possible design error, because it makes no sense 
to have an operator being triggered simultaneously by two data events that are produced 
at different rates. A possible and recommended solution is to force all producers of 
data flow streams to a unique consumer to have the same period. 



Figure 4.2. Producers with Different Periods 


Another important issue is that, although it is semantically correct in PSDL to have
several operators writing to the same data flow stream, or even to the same TRIGGERED
BY SOME sampled stream, as illustrated in Figure 4.3, this case cannot be handled unless
an upper-bound is placed on the number of concurrent copies of a stream in a PSDL
program. This restriction is due to the fact that streams have limited buffer size, and if the
number of copies is very large there is no way to guarantee that one operator will not
write to the stream right after another, and therefore cause an overflow. In the
uniprocessor case, the only way to handle this problem is by imposing very hard
restrictions on the period of the consumers, so that it is limited to, at most, half of the
minimum MET of the producers. This result may be seen as an extrapolation to this case
of Nyquist's well known sampling period theorem. Currently, CAPS does not enforce this
condition.

Figure 4.3. Potential Overflow Situation


Still, due to the powerful semantics of PSDL, there is another problem to solve,
which is the possibility of the same stream being a data flow stream for some consumers and
a sampled stream for others, as illustrated in Figure 4.4. To make things worse, these
streams can even have different latencies.



Figure 4.4. Different Stream Types Combination 


Actually, there are some other cases that could also be checked, so that
users could receive suggestions and warnings about their design. One example is
the case illustrated in Figure 4.5, where OPi could have its period increased,
consequently lowering the load factor, since nothing is gained by keeping its period
smaller than that of OPj.

Figure 4.5. Period Incompatibility among Operators

As one can expect, the above cases make the validation process of a PSDL 
program very complex. For the sake of completeness, the semantic checks and stream
type derivations for all possible combinations of operator types and data triggering 
conditions in PSDL are listed in Table 4.1. The actions which should be taken by the 
scheduler for each one of those possible combinations will also be presented. 

Table 4.1. PSDL Data Triggering Semantic Table


LEGEND

TC = Time-Critical Operator        P = Periodic Operator/Period    SS = Sampled Stream

NTC = Non-Time-Critical Operator   S = Sporadic Operator           DF = Data Flow Stream

In Table 4.1, "upper" and "lower" represent, respectively, the maximum and the
minimum values the equivalent period of the sporadic operator can assume. They are
initially set, respectively, to infinity and zero. "Actual" is the value of the triggering period
of the sporadic operator after the conversion is done. As can be seen in Table 4.1, in all
TRIGGERED BY ALL cases it is necessary to prevent, or at least give warnings,
whenever the producer operator is faster than the consumer, so that no loss of data or
overflow will be incurred [Table 4.1(1)]. Similarly, in the TRIGGERED BY SOME
cases, this constraint must also be enforced, but in this case the motivation is to prevent
loss of data, since Sampled Streams, by definition, do not overflow [Table 4.1(2)].

When dealing with sporadic operators, upper and lower bounds are defined for
their triggering periods, so that later, when conversion of the sporadic operators to 
equivalent periodic operators takes place, it is certain that all of these constraints are taken 
into consideration [see Table 4.1(3)]. The sporadic to sporadic case (S-S) cannot yet be 
handled with upper and lower bounds, since there can be up to five different possible 
overlapping patterns for their period interval. Hence, final checking of this case will be 
delayed until the equivalent periods have been calculated [Table 4.1(4)]. 

Another important point to mention is that consumers with no data triggering 
condition must be periodic, or an error will be raised [Table 4.1(5)]. 

Finally, although it is very unlikely, it should be pointed out that, for unexpected
reasons such as a lot of slack time left over from the static scheduler, some
non-time-critical operator may be fired more than once in the same Harmonic Block,
leading to a possible overflow if it is connected by data flow streams to time-critical
operators [Table 4.1(6)]. This is not a concern among NTCs, since all of them are
executed consecutively; in other words, between two consecutive instances of any NTC
operator there is guaranteed to be an instance of each of the remaining ones [Table
4.1(7)].

Table 4.2 presents all possible combinations of the PSDL timing constraints and 
the resulting actions and checks to be performed by the scheduler. 

Table 4.2. PSDL Timing Constraints Semantic Table

LEGEND
N = Not Supplied
S = Supplied

Table 4.2 shows that very few combinations of PSDL timing constraints are 
semantically acceptable. The only one that deserves some explanation is the case where 
only the MET is supplied. In this case, the scheduler picks up a pair of values for MCP 
and MRT, so that the individual load factor of the sporadic operator is equal to 

$$\frac{\max\left(0.75 - \sum LF_{periodic},\ 0.1\right)}{\text{number of sporadic operators}}$$

This approach relieves the designer from having to define timing constraints for
sporadic operators, which might not be clear yet at that stage of the prototyping, and it
also tries to decrease the timing requirements for that sporadic operator. However, it is
dangerous, in the sense that it will always increase the load factor of the prototype to at
least 0.75, even if the total load factor for all periodic operators is very low.
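
The default just described, the larger of 0.75 minus the periodic load and 0.1, shared among the sporadic operators, can be sketched as follows. The function name and argument layout are assumptions made for illustration; how the scheduler then derives the MCP and MRT pair from this load factor is not shown.

```python
def default_sporadic_load_factor(periodic_load_factors, num_sporadic):
    """Load factor given to each sporadic operator that supplies only a MET.

    periodic_load_factors: iterable of MET/PERIOD ratios of the periodic operators.
    num_sporadic: number of sporadic operators (must be positive).
    """
    if num_sporadic <= 0:
        raise ValueError("at least one sporadic operator is required")
    # Share of the processor left below the 0.75 target, never less than 0.1
    remaining = max(0.75 - sum(periodic_load_factors), 0.1)
    return remaining / num_sporadic
```

For instance, with periodic load factors 0.2 and 0.25 and two sporadic operators, each sporadic operator is given (0.75 - 0.45) / 2 = 0.15, so the total load factor becomes 0.75.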

As is apparent, most of the semantic checks, mainly those related to the control
constraints part of the PSDL program, such as data triggering checks and timing
constraints checks, are left up to the scheduler to implement. It is proposed that in
future CAPS releases some of these checks be taken from the scheduler and inserted into
the Syntax Directed Editor (SDE), so that the user is not allowed to proceed to the
translation step without a valid PSDL program. In doing so, the designer will not have
to come all the way back to the SDE if a semantic error is found.

F. TACKLING THE SYNCHRONIZATION PROBLEM 

It is clear that the most important issues in dealing with synchronization are the
periods of producer and consumer tasks. However, even in the uniprocessor case, with the
period of the consumer being smaller than the period of the producer, it can be easily
shown that synchronization is not always a good alternative. Figure 4.6 shows an
example where no feasible schedule exists if synchronization is enforced, but one does exist
otherwise. Three outcomes are possible if synchronization is not required. First, if the
consumer operator is TRIGGERED BY ALL X, Y, the proposed schedule is valid but X
and Y will be consumed one instance later. If it is TRIGGERED BY SOME X, Y, then
the schedule is always valid, because X and Y do not need to be consumed together.
Finally, if there is no trigger, then the relative order is not important anyhow.

Figure 4.6. Reason for No Synch when $PER_{prod} \geq PER_{cons}$ (Uniprocessor Case)


From another perspective, if $PER_{producer} < PER_{consumer}$, then the streams connecting
them should be sampled streams, because otherwise the data flow streams would
overflow. Since the loss of data is possible ("possible" because the data might well not be
produced at all), the consumer cannot be TRIGGERED BY SOME either.

The only case in which $PER_{producer} < PER_{consumer}$ can be allowed is when there is no
trigger at all. In this situation, synchronization is not needed, since it would place one
additional burden on the scheduler, and would not solve the problem of losing data. The
only advantage to having synchronization points in this case is the fact that there would be
a fixed pattern for losing data. Furthermore, by not having explicit synchronization, the
most that could happen is that the consumer operator would read either the previous or
the next instance of the data output by the producer, in other words, at most one producer
period apart.

Figure 4.7. Reason for No Synch when $PER_{prod} < PER_{cons}$ (Distr. Case)


The second possibility is $PER_{producer} \geq PER_{consumer}$. In this case, the synchronization
also does not solve the problem, since it is possible to have two instances of the producer
operator being scheduled, one after the other, causing overflow or loss of data depending
on the triggering condition. This case is illustrated in Figure 4.8.



At first, one may conjecture that no synchronization is needed when
$PER_{producer} \geq PER_{consumer}$, since it would be possible to catch every single occurrence of
data ever produced. However, this conjecture is untrue, due to the fact that the periodic input
is not periodic in the common sense that is understood in electrical engineering and other
related fields, as a pulse that occurs every t units of time.



Figure 4.9. Synchronization among Periodic Operators when FWa = METa


If that were so, the period ratio between producer and consumer would be a
necessary and sufficient condition for guaranteeing synchronization, according to the
following argument:

Assuming that $PER_B < PER_A$ (Eq. (1)) and that the phase of operator A is zero,
there could be two cases:

1st case: the start of the second instance of B is less than the finish of the second
instance of A,

$$s_{2B} < f_{2A} \qquad \text{Eq. (2)}$$

In this case B just lost A, and therefore it is necessary to prove that the third
instance of B will certainly catch the second instance of A. Formally,

$$s_{3B} < f_{3A}$$

By the definition of periodic operator, and from Eq. (1),

$$s_{3B} = s_{2B} + PER_B \qquad \text{Eq. (3)}$$

$$f_{3A} = f_{2A} + PER_A \qquad \text{Eq. (4)}$$

Plugging equations (1) and (2) into (3),

$$s_{3B} < f_{2A} + PER_A \qquad \text{Eq. (5)}$$

Finally, combining (4) and (5),

$$s_{3B} < f_{3A} \qquad \square$$

2nd case: $s_{2B} \geq f_{2A}$, in which case the second instance of B will catch the
second instance of A. $\square$

In general, $s_{iB} < f_{iA}$ implies $s_{(i+1)B} < f_{(i+1)A}$ and hence, neither loss of data nor
buffer overflow can happen.

However, as explained before, the definition of periodic used here is slightly different, in the
sense that an instance may occur anywhere inside its period slot, invalidating the previous
argument.

Within this framework, things are made much more complex, and the 
synchronization approach needs to change considerably. 

The key question to be answered is: What is the real need for synchronization
between two operators, and when is it applicable? As shown in the previous examples,
synchronization does not solve the problem and places an additional burden on the
scheduler.

Another question to be asked is:

Can every single piece of data coming from both data flow streams and from
TRIGGERED BY SOME sampled streams be guaranteed to be consumed in a timely
fashion, so that no overflow or loss of data occurs?

The answer is clearly yes, if after scheduling each producer of a data flow or
TRIGGERED BY SOME sampled stream, the consumer of that data flow stream, or of
that sampled stream, is scheduled before the next instance of the producer.
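
This condition can be checked mechanically on a static schedule. The sketch below assumes, purely for illustration, that one cycle of the schedule is given as a list of operator names in firing order and that the cycle repeats indefinitely; the function name is hypothetical.

```python
def consumer_follows_each_producer(schedule, producer, consumer):
    """Return True if the consumer fires between any two producer firings.

    schedule: operator names in firing order for one cycle of a cyclic schedule.
    """
    pending = False  # True while produced data has not yet been consumed
    # Walk two consecutive cycles so the wrap-around between cycles is checked too
    for op in schedule + schedule:
        if op == producer:
            if pending:
                return False  # second producer instance before any consumer
            pending = True
        elif op == consumer:
            pending = False
    return True
```

A schedule such as A, C, B passes for producer A and consumer B, while A, A, B does not.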

In a uniprocessor case, or even in a shared memory multiprocessor model, this
approach is acceptable and easy to implement and guarantee. This, by the way, is how it
is implemented right now in CAPS. However, in a truly distributed case, besides the
difficulty in implementing this approach, the lack of a master clock might cause a feasible
schedule to become infeasible. This assertion may be illustrated with a simple example.
Assume a schedule for a two-processor system that meets all deadlines and
synchronization requirements among their tasks, and that no buffer overflow occurs with
respect to the data flow streams. Now, if clock drift occurs in processor 2, so that one of
its consumers gets shifted by more than twice the period of its corresponding data flow
producer, the consumer is guaranteed to lose data, and the schedule will fail.

Therefore, although the uniprocessor and the shared-memory multiprocessor cases 
can be handled appropriately, a new approach must be developed for the distributed case.
Ideally, several sets of communicating processes would run independently in each 
processor, but with the guarantee that no data would be lost and no deadlines missed. 

It will be useful to review the synchronization problem between producers and
consumers. What is the real meaning of missing a deadline within the context of a
real-time system? It means that some process did not generate its output within the specified
amount of time, and therefore the consumer could not consume the data, and so on. What
is important here is that missed deadlines are always attached to data not being generated
or consumed at the proper time, and this is going to be the key point in the approach,
i.e., attempting to guarantee that all data being generated is consumed in a timely fashion.

Clearly, the very first condition that must be satisfied is that
$PER_{producer} \geq PER_{consumer}$, so that no data is lost. It also seems obvious at first that the
worst case that can ever happen is when two consecutive instances of the producer are fired
one after the other, and the consumer is scheduled about two periods apart. Unfortunately
this is not true, as illustrated in the following Figure 4.10.

[Figure: timelines of PRODUCER A and CONSUMER B]

Figure 4.10. The Consumer-Producer Paradigm

Figure 4.10 shows that even with a faster consumer ($PER_B \leq PER_A$) one cannot
discard the possibility of having more than one, actually even three occurrences of the
slower producer between two consecutive instances of the consumer. This finding raises
the following additional questions:

1) Under what conditions could that happen? 

2) Is there an upper-bound on the number of instances of producers between two 
consecutive instances of the consumer? What would it be? 

To answer these questions, analyze Figure 4.10 carefully.

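By construction:

$$PER_A + 2 \times MET_A \leq 2 \times PER_B \qquad \text{Eq. (1)}$$

and

$$PER_B \leq PER_A \qquad \text{(Initial Assumption)}$$

By definition of periodic operator,

$$0 \leq MET_A \leq PER_A$$

By re-arranging Eq. (1),

$$MET_A \leq PER_B - \frac{PER_A}{2}$$
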
To answer the second question, let us assume the situation presented in Figure
4.11, where four instances of the producer are attempting to exist in between the same
two instances of the consumer.



Figure 4.11. Seeking for an Upper-Bound

Eq. (1) now becomes

$$2 \times PER_A + 2 \times MET_A \leq 2 \times PER_B$$

Now let $MET_A = 0$, which is the best case possible. This results in $PER_B \geq PER_A$, and
for any positive $MET_A$ in $PER_B > PER_A$, which contradicts the initial assumption
$PER_B \leq PER_A$. There is then no solution for the set of inequalities, i.e., three is actually
the upper-bound.
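
The two sets of inequalities can be encoded directly to see that the first admits solutions while the second does not. This is a sketch under two assumptions: Eq. (1) is read as $PER_A + 2 \times MET_A \leq 2 \times PER_B$, and $MET_A$ is taken to be strictly positive. The function names are hypothetical.

```python
def three_instances_possible(per_a, met_a, per_b):
    """Eq. (1) together with the initial assumption PER_B <= PER_A."""
    return (per_b <= per_a
            and 0 < met_a <= per_a           # assumed strictly positive MET
            and per_a + 2 * met_a <= 2 * per_b)


def four_instances_possible(per_a, met_a, per_b):
    """The modified Eq. (1) for four producer instances."""
    return (per_b <= per_a
            and 0 < met_a <= per_a
            and 2 * per_a + 2 * met_a <= 2 * per_b)


# Exhaustive search over a small integer grid: no combination admits four instances
assert not any(four_instances_possible(a, m, b)
               for a in range(1, 30)
               for m in range(1, 30)
               for b in range(1, 30))
```

With a producer period of 10, a MET of 2 and a consumer period of 8, `three_instances_possible` holds, since 10 + 4 is not greater than 16.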

Based on these results the following lemmas can be stated: 

Lemma 1: 

“Given a pair of operators, where one is a producer and the other is a consumer, 
and assuming that the period of the producer is bigger than the period of the consumer, 
there can exist at most three instances of produced data waiting to be consumed at any 
instant of time”. 

Lemma 2: 

“Any produced data will be consumed within at most two periods of the 
consumer”. 

Finally, these lemmas allow the statement of the Fundamental Synchronization
Theorem, which will be most useful in the distributed case, but which can be applied as well
in the uniprocessor case.

Theorem 9:

"If there exists a feasible schedule that runs without buffer overflow or loss of
data in a shared memory multiprocessor model, then there can be a distributed and totally
independent schedule, without any kind of explicit synchronization, if the buffers of the
data flow streams, as well as those of the sampled streams with a TRIGGERED BY SOME
condition, have a size of three."
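
A stream with a buffer of size three can be pictured as a small bounded FIFO. The sketch below is illustrative only and is not the Ada95 architecture developed in this work; the class name `BoundedStream` and its methods are hypothetical.

```python
from collections import deque


class BoundedStream:
    """FIFO stream buffer with a fixed capacity (three, following the theorem)."""

    def __init__(self, capacity=3):
        self._items = deque()
        self._capacity = capacity

    def write(self, value):
        if len(self._items) >= self._capacity:
            raise OverflowError("stream buffer overflow")
        self._items.append(value)

    def read(self):
        """Return the oldest unread value, or None if the buffer is empty."""
        return self._items.popleft() if self._items else None

    def __len__(self):
        return len(self._items)
```

Up to three producer instances may write before the consumer runs, which is the worst case allowed by Lemma 1; the consumer then reads the values in production order.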

1. Additional Restrictions Imposed on the Timing Constraints 

Obviously, a price is paid for getting rid of the synchronization, and it is reflected
in a more stringent set of timing constraints for tasks.

Looking back at Figure 4.10 it can be seen that the worst case that can happen is 
to have some data from a producer consumed after 2 x PERb - METb units of time. 

Currently, in PSDL, unlike in the sporadic case, there is no upper-bound on
the time within which input data for a periodic operator should be consumed. So, if the
consumer is a periodic operator that receives data from network streams, not using
synchronization will not impose any additional constraints on its timing requirements.

In the sporadic case however, the explicit upper-bound for consuming the 
incoming data is its MRT, which is assumed to be greater than or equal to the latency plus 
the MET of the consumer operator for the incoming data. Therefore, an additional 
restriction on the triggering period of a sporadic operator must be imposed when it has 
any data coming from network streams. 



Figure 4.12. New Timing Constraints for the Sporadic Operator

From Figure 4.12,

$$2 \times TP_B + LAT_{max} \leq MRT_B$$

or

$$TP_B \leq \frac{MRT_B - LAT_{max}}{2}$$

which is the new upper-bound for the triggering period of a sporadic operator.
From Chapter III, Section E, it is also known that $TP \geq MET$. Hence,

$$MET_B \leq TP_B \leq \frac{MRT_B - LAT_{max}}{2}$$

which is the new formula for calculating the triggering period of a sporadic operator,
under the no-synchronization assumption.
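
As a worked illustration of these bounds, the sketch below computes the admissible interval for the triggering period, reading the result as MET <= TP <= (MRT - LATmax) / 2. The function name and the choice to raise an error on an empty interval are assumptions; all times are taken to be in the same unit.

```python
def triggering_period_bounds(met_b, mrt_b, lat_max):
    """Return (lower, upper) bounds for the triggering period of sporadic operator B."""
    lower = met_b                     # TP >= MET
    upper = (mrt_b - lat_max) / 2     # from 2 * TP + LATmax <= MRT
    if lower > upper:
        raise ValueError("no feasible triggering period without synchronization")
    return lower, upper
```

With MET = 10, MRT = 100 and LATmax = 20, the triggering period may be chosen anywhere between 10 and 40 time units.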

G. THE TASK ALLOCATION MODEL 

Two basic and unavoidable steps when designing distributed software systems are
the decomposition of the system functions into software processes during the early stages
of the design and, later on, the allocation of these processes to the several processors.
Although sometimes these two terms are used interchangeably, they are very different
activities.

Given the software requirements, the designer must first identify a set of logical 
interrelated modules and perform its functional decomposition. This can be done with the 
aid of traditional design methods, such as structured and object oriented design. For
real-time systems, such decomposition will require consideration of critical timing constraints
and may require introduction of special modules for synchronization [SW89].

The first major activity is partitioning, which is the mapping of these logical 
modules into a set of physical processes. The second is allocation (sometimes called 
assignment) which is the mapping of each process to one or more processors. The focus 
of this chapter is on allocation; for further reading on partitioning see Shatz and Wang 
[SW89]. 

As shall be seen, task allocation dramatically complicates the already complex
problem of distributed software design, because when assigning m processes onto n
processors, there are $n^m$ different possible assignments. Optimal allocation is a problem of
exponential complexity, and it was proven to be NP-complete by Mok [Mok83].
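
The growth in the number of assignments is easy to observe by enumeration. The sketch below is only illustrative; the function name is hypothetical, and processes and processors are identified by integer indices.

```python
from itertools import product


def all_assignments(num_processes, num_processors):
    """Yield every mapping of processes to processors as a tuple.

    Element i of each tuple is the processor index assigned to process i.
    """
    return product(range(num_processors), repeat=num_processes)


# Four processes on three processors give 3**4 possible assignments
count = sum(1 for _ in all_assignments(4, 3))
assert count == 3 ** 4
```

Exhaustive enumeration of this kind quickly becomes impractical as the number of processes grows, which motivates the approaches discussed in the rest of this section.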

The key to process allocation is to establish an allocation model in terms of a cost
function and additional constraints that match the application requirements as far as logical
and timing correctness. The goal is to minimize the cost function under the constraints.
Most of the cost functions found in the available literature deal with performance. Others,
such as those relating to reliability and fault-tolerance, are only now emerging [SW89].

The most widely used performance cost functions are: 

1) Interprocessor communication cost (IPC) which is a function of the amount of 
data transferred, the network topology and link capacity; 

2) Load balancing, which is a measure of how uniform the workload among the
processors is. A good load balancing will maximize the system stability, which
is the capability of busy hosts to receive bursty arrivals of processes without
compromising their deadlines.

3) Completion time, the total execution time including interprocessor 
communication incurred by that processor. 

The most frequent constraints found in typical real-time systems are due to
hardware limitations of some processors, dependence of some processes on certain
processors, and number of available processors.

The choice of a cost function obviously depends on the application, on the 
underlying hardware, and on several other characteristics. 

Although distributed processing seems very attractive, one should be aware of the
saturation effect (Figure 4.13) that is sometimes forgotten by many developers. The basic
consequence of this effect is that, contrary to expectations, the throughput doesn't
increase linearly as the number of processors is increased. Actually, at some point (which
can be as few as three or four processors) throughput starts to decrease.
Examples of this phenomenon are documented by Chu et al. [CHL80] and by Jenny
[Jen77]. The decrease in throughput is due to the excessive interprocessor
communication, which is similar to the thrashing problem in the early memory paging
systems.

Figure 4.13. The Saturation Effect

Basically, all of the different approaches to solve the allocation problem fall into 
one of the three major classification areas: graph theoretic, mathematical programming, or 
heuristic methods, which are by no means mutually exclusive. 

The first of these represents the processes to be allocated as nodes in a graph,
where each edge has a weight that is proportional to its inter-module communication cost
(IMC), with the following remarks: an IMC of zero means that no communication takes
place between those two modules and an IMC of infinity means that they should be
assigned to the same processor. If a minimal-cut algorithm is performed on the graph one
ends up with the minimum allocation cost for those modules between two processors. In
general, however, an extension of this method to an arbitrary number of processors
requires an n-dimensional min-cut flow algorithm, which quickly becomes
computationally intractable.

The mathematical programming approach uses, in most cases, the non-linear
integer programming technique, where the above problem is formulated as a set of
equations. It is very flexible in the sense that additional constraints can be included in the
model very easily; however, it has two shortcomings. First, it fails to accurately represent
real-time constraints and precedence relations among the tasks, because both factors
introduce queuing delays into the system in a complex manner [DSWE83].

Finally, the heuristic methods, unlike the first two, try to find sub-optimal solutions
for the assignment problem, which are in general faster, more extendible and simpler.

1. Some Basic Definitions 


Defining several metrics will provide a better insight into the problem.

Average Task MET - given n tasks, it is a lower-bound on the response time:

$$MET_{avg} = \frac{\sum MET}{n}$$

Average Load Factor - it is a kind of schedulability index that shows how tight the
system is. The bigger it is, the harder it is to find a schedule. It is independent of the number
of processors, e.g., $LF_{avg} = 0.8$ means that almost every operator is very CPU-intensive.
A more precise insight could be obtained from the standard deviation of the load factor.

$$LF_{tot} = \sum \frac{MET}{PER}$$

$$LF_{avg} = \frac{LF_{tot}}{n}$$

Average Processor Load Factor - given the number of processors p, it specifies
the ideal share of processing so that a perfect load balancing is achieved.

$$PLF_{avg} = \frac{LF_{tot}}{p}$$

Maximum Processor Load Factor - it specifies the maximum load factor each
processor can sustain using the minimum number of processors,

$$PLF_{max} = \frac{LF_{tot}}{\lceil LF_{tot} \rceil}$$
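
The averages defined above can be computed as in the following sketch. It assumes each task is given as a (MET, PERIOD) pair in the same time unit; the function name and the dictionary keys are hypothetical, and only the average metrics are covered.

```python
def task_metrics(tasks, num_processors):
    """Compute the average MET and load factor metrics of a task set.

    tasks: list of (met, period) pairs.
    num_processors: number of processors p.
    """
    n = len(tasks)
    met_avg = sum(met for met, _ in tasks) / n
    lf_tot = sum(met / period for met, period in tasks)
    return {
        "met_avg": met_avg,                  # average task MET
        "lf_tot": lf_tot,                    # total load factor
        "lf_avg": lf_tot / n,                # average load factor per task
        "plf_avg": lf_tot / num_processors,  # ideal per-processor share
    }
```

For three tasks with (MET, PERIOD) of (10, 100), (20, 50) and (30, 100) on two processors, the total load factor is 0.8 and the ideal share per processor is 0.4.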


Placement Cost Matrix - it basically shows the cost incurred when operator X is
allocated to processor k. If some task must be placed on a specific processor, its
placement cost for that processor should be zero; otherwise it should be infinity. Other
values reflecting the user's desires can also be used so that the scheduler will have more
options when deciding upon the allocation.

Placement Cost    Processor 1    Processor 2    Processor 3
Operator A        ∞              0              4
Operator B        0              ∞              7
Operator C        5              8              5

Table 4.3. Placement Cost Matrix

Inter-Module Communication Cost Matrix - it basically shows the cost incurred 
when operator X wants to communicate with operator Y, or vice-versa, using the 
network. Note that it should be symmetric, since it doesn’t depend on the way the 
communication is carried out. It simply states that if those two operators are allocated in 
different processors, that will be the amount of communication they will have to exchange. 
In this case it will also account for the state streams. 



Table 4.4. IMC Cost Matrix

Distance Cost Matrix - it takes into account the geographic distance between
processors. For all distances within a local area network, index 1 is assumed. When not
connected, the distance is assumed to be infinite. If passage through additional networks
is required, there will be an increase of 0.1 for each additional level of networking. Note
that the basic purpose of this matrix is to see if the specified latencies and network delays
are compatible with the underlying hardware architecture.



Table 4.5. Distance Cost Matrix 
* Note that the terms IMC and IPC are used interchangeably.


















2. The Approach


The first attempt was to separate tasks according to their data dependency, which
was determined by calculating the several slices of the prototype. Informally, a slice is
defined as the set of possible paths from a sink node (a node with no output edges) to a root
node (a node with no input edges), i.e., a slice contains all ancestors of a sink node. For a
formal definition see Dampier [Dam94]. Clearly, an operator can belong to more than one slice.
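
Following the informal definition, a slice can be computed by walking backwards from each sink node. The sketch below assumes the prototype graph is given as a list of (producer, consumer) edges and that each slice includes its sink; it is an illustration, not the formal definition in [Dam94], and the function name is hypothetical.

```python
def slices(edges):
    """Return {sink: set of operators in its slice} for a dataflow graph.

    edges: list of (producer, consumer) operator name pairs.
    """
    parents = {}         # consumer -> set of its direct producers
    nodes = set()
    has_output = set()   # operators with at least one outgoing edge
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)
        nodes.update((src, dst))
        has_output.add(src)

    result = {}
    for sink in sorted(nodes - has_output):
        seen, stack = {sink}, [sink]
        while stack:  # depth-first walk over all ancestors of the sink
            for parent in parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        result[sink] = seen
    return result
```

An operator feeding two sinks appears in both of the corresponding slices, which is the situation the grouping step has to resolve.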



Figure 4.14. The Data Dependency Graph


After all slices are calculated, the operators that belong to the same slices are
grouped into equivalence classes, such as Ga, Gab, Gcde, etc., meaning that they belong to
slice A, slices A and B, or slices C, D, and E, respectively. The resulting graph is the Data
Dependency Graph, which is shown in Figure 4.14. The following algorithm can then be
applied:

1) Pick those operators that belong to two slices. At least one operator must exist
in this equivalence class that has two edges, one for each of the slices it belongs
to. Pick the least expensive edge, i.e., the one with the least IMC cost, and add
the operator to this group. This may later prove to be something less than the
best choice, but for now it is the best option available without resorting to the
expensive method of checking the entire slice. The final partition is illustrated
by the dotted line in Figure 4.14, and presents a cost of 117 IMC units. To get
rid of this problem, instead of trying to join in a bottom-up fashion, the most
expensive edge not yet included in any group may be added, and an attempt can
be made to unite both groups, resulting in the following partition: {Ga, Gabd,
Gab}, {Gabc, Gc}, {Gde, Gd}, {Gb} and {Ge}, which has a cost of 56 IMC
units;

2) Keep doing this for the operators belonging to three slices, four, etc., until all
operators have been processed.

3) If the load factor in some set exceeds one or some specified threshold then the
set should be split into two by recursively applying the two-dimensional
minimal-cut algorithm, until all sets have a load factor less than one. Note that
since the min-cut algorithm is trying to minimize the cost of the edge, it may
well not be an optimal choice for minimizing load factor. Checking for load
factor is left until the end because the relative costs of those edges could not be
determined prior to completing the first two steps.

The intended result was to have several fairly data independent sub-graphs that
could be assigned to different processors, having a minimum IPC cost, and, most
importantly, providing a very nice modularization for the system with direct effects on
reliability. For example, if some processor had a problem, only those modules allocated in
that processor would fail. Of course, this approach did not take into account load
balancing, but at least provided a starting point.

Unfortunately, after running a partial implementation of this algorithm with several
randomly generated prototypes, its computation cost proved to be very high and most of the
prototypes ended up having very few slices to start with.

After analyzing the advantages and disadvantages of the initial attempt and several
other alternatives, it was decided to use the inter-module communication cost (IMC) as
the main cost function, without taking into consideration any data dependency.

Now it is necessary to come up with a consistent way of assigning the IMC cost to 
each pair of operators in a PSDL graph. 

Clearly, in the PSDL context, where complex ADTs can travel through the
streams, the amount of data transferred by a stream is variable, and its actual size can only
be known at run-time when the actual prototype is executing. Therefore it is necessary to
use some kind of average or normalized value, so that the deviations are diminished.
Another assumption to be made (it is actually already part of the PSDL model) is that
every operator, when fired, outputs one and only one value per firing for each of its output
streams. Furthermore, the worst case is assumed, where, once activated, the operator will
always produce an output, even if the data triggering conditions or the output guards are
not satisfied.

The IMC cost, represented as IMC_INDEX, and the actual amount of data to be
transmitted between two operators, denoted as IMC_PER_SEC, are calculated according
to the algorithm described in Figure 4.15.


for each pair of operators loop
   if parent operator is TC then
      IMC_PER_SEC := CONNECTIVITY x AVG_PROC_TIME x 1000 / PERIOD_PRODUCER;
   elsif parent is NTC then
      IMC_PER_SEC := CONNECTIVITY x AVG_PROC_TIME x 1000 / HARMONIC_BLOCK;
   end if;
   IMC_INDEX := IMC_PER_SEC / NORMALIZED_LOAD_FACTOR;
end loop;

Figure 4.15. Algorithm for Calculating the IMC Cost Function 

Note that in order to quantify and compare IMCs it was necessary to fix the time 
window for measurement and the second was chosen. 

AVG_PROC_TIME is the estimated average time in microseconds taken for that
system to output a typical PSDL stream to some buffer, which will be later transmitted to
the network. Note that this parameter is innocuous, since it is a constant for every stream.
The only reason to maintain the parameter is to make the resulting index more realistic.

CONNECTIVITY is defined as the number of streams connecting two operators,
including the state streams.

The ratio 1000 ms / PERIOD (ms) for the time-critical operator specifies the
number of periods that occur in one second, that is, the number of times the producer will
fire. For the non-time-critical operator the HARMONIC_BLOCK (HB) is used, as if there
were only one occurrence of the NTC operator in each HB.

Finally, for the IMC_INDEX the NORMALIZED_LOAD_FACTOR is introduced,
defined as:

(LOAD_FACTOR_PARENT + LOAD_FACTOR_CHILD) / MAX_LF_PER_PROC

Note that the above formula is valid for any case except when both operators are
NTCs. In this case the formula is changed to:

((1.0 - MAX_LF_PER_PROC) + (1.0 - MAX_LF_PER_PROC)) / MAX_LF_PER_PROC

or

(2.0 / MAX_LF_PER_PROC) - 2.0

The rationale behind these formulas is that if there are two small LF operators
connected by a stream with some IMC_PER_SEC, the IMC_INDEX or, rather, the
relative cost for placing them in different processors should be much higher than if they
had big load factors, for the same IMC_PER_SEC value. For streams connecting two NTC
operators, which have no explicit load factor since they have neither periods nor METs,
the remaining load factor will be used. In other words, 1.0 - TOTAL_LF is used as if it were
the load factor. If the load factor is bigger than one, then there must be more than one
processor, so the maximum average load factor per processor is used instead,
assuming that the minimum number of processors is available.

Although it is not used in the current implementation, it seems to be a good idea to
divide the remaining LF among all NTC operators. This way it would be less costly to
split two NTCs, where the total load factor of the prototype is 0.8, than to split two TC
operators both with load factors 0.2. In the current implementation, both cases have the
same cost.
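
The calculation of Figure 4.15 and the normalization above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Operator` record and the function names are hypothetical, and a pair made of one TC and one NTC operator is assumed to use the remaining load factor (1.0 - MAX_LF_PER_PROC) for the NTC member, which reduces to the both-NTC formula when neither operator has an explicit load factor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Operator:
    """Hypothetical record for a PSDL operator; NTC operators leave both fields unset."""
    name: str
    period_ms: Optional[float] = None    # PERIOD of a time-critical (TC) operator
    load_factor: Optional[float] = None  # explicit load factor of a TC operator


def imc_per_sec(parent, connectivity, avg_proc_time_us, harmonic_block_ms):
    """IMC_PER_SEC as in Figure 4.15: firings per second times data per firing."""
    # A TC parent fires once per period; an NTC parent is treated as firing
    # once per harmonic block.
    interval_ms = parent.period_ms if parent.period_ms is not None else harmonic_block_ms
    return connectivity * avg_proc_time_us * 1000.0 / interval_ms


def effective_load_factor(op, max_lf_per_proc):
    """Explicit load factor for TC operators, remaining load factor for NTC ones."""
    if op.load_factor is not None:
        return op.load_factor
    return 1.0 - max_lf_per_proc  # assumption: also applied to mixed TC/NTC pairs


def imc_index(parent, child, connectivity, avg_proc_time_us,
              harmonic_block_ms, max_lf_per_proc):
    """IMC_INDEX := IMC_PER_SEC / NORMALIZED_LOAD_FACTOR."""
    normalized = (effective_load_factor(parent, max_lf_per_proc)
                  + effective_load_factor(child, max_lf_per_proc)) / max_lf_per_proc
    return imc_per_sec(parent, connectivity, avg_proc_time_us,
                       harmonic_block_ms) / normalized
```

With MAX_LF_PER_PROC = 0.8, two TC operators with load factors of 0.2 each and two NTC operators obtain the same normalized load factor of 0.5, which matches the observation that both cases currently have the same cost. Note that the sketch divides by zero when both operators are NTCs and MAX_LF_PER_PROC is exactly 1.0; a real implementation would have to decide how to treat that case.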

3. The Current Implementation 

As the very first step, the allocation algorithm builds a priority queue of edges in
decreasing order of inter-module communication cost (IMC_INDEX), which was
previously calculated. Note that it will contain all edges in the prototype and not only
those connecting time-critical operators.

Once the priority queue exists, each operator is allowed to form a set by itself. 
Next a union-find algorithm is applied, so that if the origin and destination operators of the 
edge being examined belong to different sets, they are united (as long as their combined 
load factor is still under some threshold previously established by the user). 

begin -- allocate

   -- Build a priority queue of edges in decreasing order of IMC_INDEX
   BUILD_PRI_QUEUE(COUNT);

   -- Let each operator form a distinct set by itself.
   for I in 1..NEW_GRAPH_PKG.ARRAY_SIZE loop
      OP := NEW_GRAPH_PKG.RETURN_OP(I);
      OP_UNION_FIND_PKG.CREATE(OP_LINK(I).OP);
   end loop;

   while IMC_PRIORITY_QUEUE.NON_EMPTY(PRI_QUEUE) loop
      EDGE := IMC_PRIORITY_QUEUE.READ_BEST(PRI_QUEUE);
      ROOT_A := OP_UNION_FIND_PKG.FIND(OP_LINK(EDGE.ORIGIN));
      ROOT_B := OP_UNION_FIND_PKG.FIND(OP_LINK(EDGE.DEST));
      if not OP_UNION_FIND_PKG.eq(ROOT_A, ROOT_B) then
         if ROOT_A.LF + ROOT_B.LF <= ALLOCATION_FACTOR then
            ROOT_C := OP_UNION_FIND_PKG.UNION(ROOT_A, ROOT_B,
                                              ALLOCATION_FACTOR);
         end if;
      end if;
      IMC_PRIORITY_QUEUE.REMOVE_BEST(PRI_QUEUE);
   end loop;

end allocate;

Figure 4.16. Partial View of the Allocation Program 

As can be seen, the current approach is a kind of first-fit bin-packing, where the
size of the bin is dictated by the ALLOCATION_FACTOR specified by the user. A very
simple modification which would allow better load balancing is to replace the
ALLOCATION_FACTOR with the AVERAGE PROCESSOR LOAD FACTOR of the
prototype, multiplied by some number, for example, 1.1, to allow some variation around
the average. Doing this enforces an even load across all processors,
at the price of an increase in the communication cost. Other checks could be applied as well,
such as checking the requirements or the placement cost matrix to see if the operators
could be allocated to the same processor, or if they needed to be in a specific processor.
The slices they belong to could also be examined, so that even if the load balancing rule is
not completely satisfied they could still be assigned to the same processor if they were in
the same slice. As can be seen, there is an enormous number of possibilities for cost
functions. However, finding the one that best fits the application requires a great deal of
fine tuning.

The union-find data structure has been implemented as an in-tree, where the nodes
can have many children. Therefore, after all the sets have been formed, an O(n^2)
worst-case algorithm is needed in order to retrieve their members. Another way to implement it that
would make the retrieve operation much cheaper is by using a doubly linked list, but then
the insert operation would be a little more expensive. In both cases, the union-find
algorithm could be enhanced by adding path compression and balancing to the
implementation, resulting in an O(m log n) time algorithm, where m is the number of edges
in the graph.
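
The allocation loop of Figure 4.16, together with the path compression and balancing mentioned above, can be sketched as follows. This is an illustrative Python sketch and not the CAPS code: the class and function names are hypothetical, balancing is assumed to mean union by size, and the load factor of each set is kept at its root.

```python
class LoadAwareUnionFind:
    """Union-find whose roots also carry the combined load factor of their set."""

    def __init__(self, load_factors):
        self.parent = {op: op for op in load_factors}
        self.size = {op: 1 for op in load_factors}
        self.load = dict(load_factors)

    def find(self, op):
        # First pass: locate the root of the in-tree.
        root = op
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): hang every visited node on the root.
        while self.parent[op] != root:
            next_op = self.parent[op]
            self.parent[op] = root
            op = next_op
        return root

    def union_if_fits(self, a, b, allocation_factor):
        """Merge the sets of a and b unless their combined load exceeds the bin size."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False
        if self.load[root_a] + self.load[root_b] > allocation_factor:
            return False
        # Balancing (union by size): attach the smaller tree under the larger one.
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]
        self.load[root_a] += self.load[root_b]
        return True


def allocate(load_factors, edges, allocation_factor):
    """edges holds (imc_index, origin, dest) triples; returns the list of partitions."""
    sets = LoadAwareUnionFind(load_factors)
    # Examine edges in decreasing order of IMC_INDEX, as the priority queue does.
    for _, origin, dest in sorted(edges, key=lambda edge: edge[0], reverse=True):
        sets.union_if_fits(origin, dest, allocation_factor)
    partitions = {}
    for op in load_factors:
        partitions.setdefault(sets.find(op), []).append(op)
    return list(partitions.values())
```

Grouping the operators by root in a single pass avoids the quadratic retrieval step, at the cost of one extra dictionary. The load-balancing variant discussed earlier would amount to passing 1.1 times the average processor load factor as `allocation_factor`.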

Finally, the allocation algorithm outputs a set of sets, i.e., a set where each of the
components is another set containing the nodes in that partition. Although not included in
the current implementation, it should ultimately output a map instead of a set, where each
of the partitions would be mapped to a specific processor, according to the requirements.

V. ARCHITECTURAL ISSUES OF THE CAPS SCHEDULER


Section A of this chapter describes several issues related to the architecture of the
CAPS scheduler in its current uniprocessor implementation. Section B presents a novel
architecture for dealing with the distributed scheduling case. The remaining sections of
this chapter contain a proposed implementation, first using the currently available
technology and then using the upcoming facilities offered by Ada95. It is important to
note, however, that while implementing the distributed system in Ada provides a uniform
environment for building prototypes, it suffers from the disadvantage that tasking and the
new distributed systems support in Ada95 are not time-bounded. Hence, in order for the
distributed Ada prototype to satisfy the timing constraints as specified, the average
behavior of the underlying host operating system and the network communication
subsystem must be relied upon.

A. THE CURRENT SCHEDULER - UNIPROCESSOR ARCHITECTURE 

Currently, CAPS is a development environment, implemented in the form of a
collection of tools that are linked together by a user interface. The prototyping process is
accomplished by running several tools independently, one after the other, so that their
outputs taken together make up the final Ada program, which will implement the
supervisory control of the prototype.

More specifically, the translator converts the PSDL program defined by the user
into compilable Ada units. During this process, it creates the following five major
packages: exceptions, instantiations, timers, streams, and drivers, all preceded by the name
of the prototype followed by an underscore. Ultimately each of these will become part of
the prototype supervisory Ada program.

The first three of these packages contain all of the user declared exceptions,
generic packages and timer instantiations defined in the PSDL program. The package
streams contains the instantiations of all the streams used by the prototype, which are
implemented as Ada generic tasks contained in the generic package PSDL_STREAMS,
which contains all stream types supported by PSDL. A partial view of the supervisory
program for the Patriot Missile prototype is shown in Figure 5.1.

package PATRIOT_EXCEPTIONS is
   -- PSDL exception type declaration
   type PSDL_EXCEPTION is (UNDECLARED_ADA_EXCEPTION);
end PATRIOT_EXCEPTIONS;

package PATRIOT_INSTANTIATIONS is
   -- Ada Generic package instantiations
end PATRIOT_INSTANTIATIONS;

with PSDL_TIMERS;
package PATRIOT_TIMERS is
   -- Timer instantiations
end PATRIOT_TIMERS;

-- with/use clauses for atomic type packages
-- with/use clauses for generated packages.
with PATRIOT_EXCEPTIONS; use PATRIOT_EXCEPTIONS;
with PATRIOT_INSTANTIATIONS; use PATRIOT_INSTANTIATIONS;
-- with/use clauses for CAPS library packages.
with PSDL_STREAMS; use PSDL_STREAMS;

package PATRIOT_STREAMS is

   -- Local stream instantiations
   package DS_INTERCEPT_ANGLE_CONTROL_PATRIOT is new
      PSDL_STREAMS.FIFO_BUFFER(FLOAT);
   package DS_LAUNCH_ANGLE_LAUNCH_PATRIOT is new
      PSDL_STREAMS.FIFO_BUFFER(FLOAT);
   package DS_LAUNCH_STATUS_SCUD_RADAR is new
      PSDL_STREAMS.SAMPLED_BUFFER(LAUNCH_STATUS_RECORD);
   package DS_LAUNCH_STATUS_DISPLAY_SCUD is new
      PSDL_STREAMS.SAMPLED_BUFFER(LAUNCH_STATUS_RECORD);
   package DS_LAUNCHER_POSITION_SCUD_RADAR is new
      PSDL_STREAMS.SAMPLED_BUFFER(FLOAT);
   package DS_MISSILE_TRACK_CHECK_THREAT is new
      PSDL_STREAMS.SAMPLED_BUFFER(TRACK);
   package DS_SCUD_STATUS_DISPLAY_SCUD is new
      PSDL_STREAMS.SAMPLED_BUFFER(MISSILE_STATUS);
   package DS_SCUD_TRACK_DISPLAY_SCUD is new
      PSDL_STREAMS.SAMPLED_BUFFER(TRACK);
   package DS_TACTICAL_STATUS_DISPLAY_TACTICAL is new
      PSDL_STREAMS.SAMPLED_BUFFER(MISSILE_STATUS_RECORD);
   package DS_TARGET_RANGE_CONTROL_PATRIOT is new
      PSDL_STREAMS.FIFO_BUFFER(FLOAT);

   -- State stream instantiations
end PATRIOT_STREAMS;


Figure 5.1. Partial View of Patriot.a

Currently, the CAPS implementation supports only the sampled streams, where data
can always be written and read; the state streams, which are basically sampled streams
with an initial value; and the data flow streams, which are implemented as a FIFO buffer
of size one. The streams are implemented as individual Ada tasks with entries such as
READ, WRITE and CHECK, whose implementation will vary according to the type of
stream.
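
The difference between the three kinds of streams can be illustrated with a small Python sketch. It is not the CAPS implementation, which uses Ada tasks and rendezvous; the class names are hypothetical, and raising an overflow when a full data flow buffer is written again is an assumption based on the usual PSDL semantics rather than something stated above.

```python
class BufferUnderflow(Exception):
    """Raised when a data flow stream is read without a new value."""


class BufferOverflow(Exception):
    """Raised when a full data flow stream is written again (assumed behavior)."""


class SampledStream:
    """Data can always be written and read; the last value is kept."""

    def __init__(self):
        self._value = None   # unspecified until the first write
        self._fresh = False

    def write(self, value):
        self._value = value
        self._fresh = True

    def read(self):
        self._fresh = False  # the value is now old data
        return self._value

    def check(self):
        """True when a value has been written since the last read."""
        return self._fresh


class StateStream(SampledStream):
    """A sampled stream that starts with an initial value."""

    def __init__(self, initial_value):
        super().__init__()
        self._value = initial_value


class DataFlowStream:
    """FIFO buffer of size one: each value is consumed exactly once."""

    def __init__(self):
        self._value = None
        self._full = False

    def write(self, value):
        if self._full:
            raise BufferOverflow("previous value not yet consumed")
        self._value = value
        self._full = True

    def read(self):
        if not self._full:
            raise BufferUnderflow("no new value to read")
        self._full = False
        return self._value

    def check(self):
        return self._full
```

Reading a `SampledStream` before its first write returns an unspecified value (`None` in this sketch), which is why a state stream with an explicit initial value is the safer choice when the first firing may precede the first write.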

Finally, the package drivers contains all of the data declarations, the data
trigger checks that control whether a stream should or should not be read, the execution
trigger checks that decide whether or not to fire the operator, and the output guard
checks, which determine whether or not an output is to be written to the output streams.
Each of these checks is implemented in the following way:

1. Data Triggers 

If an operator has no triggering condition at all, its input streams will be read
whenever the operator is fired, but they will never generate any overflow or underflow
exceptions. A similar situation arises when the streams are state streams.

If at least one of the incoming streams is a TRIGGERED BY SOME sampled
stream, then the streams will be read whenever one or more of the streams in the
TRIGGERED BY SOME set has new data, but again, they will never generate an
underflow exception. Because of this, care must be taken with respect to the very first
reading of data from sampled streams, since garbage may be consumed.

Figure 5.2. TRIGGERED BY SOME Implementation


If at least one of the incoming streams is a data flow stream, in other words, has a
TRIGGERED BY ALL condition, the streams will only be read if the data flow stream
has a new value in its buffer, and any attempt to read an old value from a data flow
stream will generate an underflow exception. As shown in Figures 5.2 and 5.3, the read
operation is actually a call to rendezvous with the READ entry of the incoming stream
task.

Figure 5.3. TRIGGERED BY ALL Implementation


2. Execution Triggers 

The execution trigger is where the actual program that implements the
functionality of that operator, which is provided by the user, will be called if the conditions
are satisfied. These conditions come from the TRIGGERING IF part of the PSDL
program. Note that even if they are not satisfied, the data has already been consumed, and
is therefore marked as old data.

Figure 5.4. TRIGGERING IF Implementation



3. Output Guards 

Finally, the output guards are checked. If the conditions are satisfied, a 
rendezvous with the output stream tasks is requested by calling their WRITE entry. 



Figure 5.5. Output Guards Implementation

Besides these packages that are generated by the Translator, there are another two
packages generated by the Static Scheduler and by the Dynamic Scheduler. When
consolidated by one of the CAPS scripts, they will form the so-called prototype
supervisory program, receiving the name of the prototype followed by a ".a" extension,
which stands for an Ada program.


Main Program:
   procedure prototype_name is
   begin
      init_hardware_model;
      start static schedule;
      start dynamic schedule;
   end prototype_name;

CAPS Support Packages

Exception Declarations
Generic Instantiations
Timer Instantiations
Data Stream Instantiations
Operator Drivers

Dynamic Schedule Task Package:
   while true loop
      call non-time-critical operator drivers;
   end loop;

Static Schedule Task Package:
   while true loop
      call time-critical operator drivers;
   end loop;


Figure 5.6. CAPS Supervisory Program Structure 


CAPS is composed of four major Ada tasks with the following priorities, as
defined in the package PRIORITY_DEFINITIONS:

1) Debugger Task - it handles all CAPS debugging tools used during prototype 
execution, and has the highest priority within CAPS, which is 4 

2) Stream Tasks - each stream is implemented as one Ada task with priority 3

3) Static Scheduler Task - it is responsible for calling all the time-critical
operators, according to the static schedule. The TC operators will be called in
a non-preemptive way, so that each instance of an operator will execute to
completion, being preempted only by the debugger task, or during operations
with the stream tasks. It has a priority of 2. Note that, although the stream
tasks have higher priority, they are called (synchronized) by this task, so that
there will be no problems such as another stream from another operator trying
to gain control of the CPU.

4) Dynamic Scheduler Task - it is assigned the lowest priority (priority 1) within 
CAPS, and it handles all the non-time critical operators of the prototype. They 
will run in a pre-defined order established by the dynamic schedule, whenever 
there is idle time in the static schedule. The NTC operators, due to their low 
priority, can be preempted by any other task and, as a matter of fact, they are 
not even guaranteed to run at all. This problem of unbounded blocking will be 
addressed later on. 

B. THE PROPOSED DISTRIBUTED ARCHITECTURE 

In the uniprocessor case, the translator had no information about the output of the
scheduler. For the distributed case, however, this information is crucial, since the translator
will have to generate different Ada units for each of the processors involved in the prototyping.

Once the scheduler has generated the different partitions, defining which operator
belongs to which partition, the translator will have to be called, so that it can generate as
many supervisory files as the number of partitions. It is suggested that the prototype name
followed by the partition number be used as the naming convention for the supervisory
files, e.g., PATRIOT_1.a, PATRIOT_2.a, and so on.

The following information should be passed by the scheduler to the translator, so
that it can perform its job:

1) Number of partitions and a list with the operator names belonging to each
partition

2) Mapping from partitions to processors according to the requirements

For the sake of simplicity, it is assumed that there is a homogeneous cluster of
processors, so that a configuration of partitions is not needed. The process of mapping
the partitions of a program to the nodes in a distributed system is called configuring the
partitions. Note, however, that even after item 2 has been dropped, there is still a
need to provide the translator with the names of the processors. It is suggested that this
information come from the CAPS interface.

Once this information is available to the translator, it should generate a supervisory 
file for each partition, exactly as it did for the uniprocessor case, except for the following 
differences: 

1) In the new package streams, where the streams are instantiated, if a specific
stream is going to some operator external to that partition, and only in that
case, it should be hard-coded as an instantiation of a special and newly created
kind of stream, i.e., the network stream. Note that this stream has only one
entry, which is write_external, considering that all reads will be to local
streams. Certainly, the package PSDL_STREAMS will have to be totally
changed to conform with the new model for distributed scheduling without
synchronization, which requires a buffer size of three for the network streams.
Another modification made in this package relates to the sampled streams,
which are now divided into two groups, non-triggering (NT) and TRIGGERED
BY SOME (TBS), since they have quite different semantic behaviors. Figure
5.7 shows the specification of the new package containing the stream tasks.

with PRIORITY_DEFINITIONS;
use PRIORITY_DEFINITIONS;
package PSDL_STREAMS is

   BUFFER_OVERFLOW: exception;
   BUFFER_UNDERFLOW: exception;

   -- Implements a buffer with size 1, for sampled
   -- streams with no triggering condition (NT)
   generic
      type ELEMENT_TYPE is private;
   package NT_SAMPLED_BUFFER is
      task BUFFER is
         pragma PRIORITY(BUFFER_PRIORITY);
         entry READ(VALUE: out ELEMENT_TYPE);
         entry WRITE(VALUE: in ELEMENT_TYPE);
      end BUFFER;
   end NT_SAMPLED_BUFFER;

   -- Implements a buffer with size 3, for sampled
   -- streams that have triggering "BY SOME"
   -- condition (TBS)
   generic
      type ELEMENT_TYPE is private;
   package TBS_SAMPLED_BUFFER is
      task BUFFER is
         pragma PRIORITY(BUFFER_PRIORITY);
         entry CHECK(NEW_DATA: out BOOLEAN);
         entry READ(VALUE: out ELEMENT_TYPE);
         entry WRITE(VALUE: in ELEMENT_TYPE);
      end BUFFER;
      function NEW_DATA return BOOLEAN;
   end TBS_SAMPLED_BUFFER;

   -- Implements a buffer with size 1, for state streams
   -- that have no triggering condition (NT)
   generic
      type ELEMENT_TYPE is private;
      INITIAL_VALUE: ELEMENT_TYPE;
   package NT_STATE_BUFFER is
      task BUFFER is
         pragma PRIORITY(BUFFER_PRIORITY);
         entry READ(VALUE: out ELEMENT_TYPE);
         entry WRITE(VALUE: in ELEMENT_TYPE);
      end BUFFER;
   end NT_STATE_BUFFER;

   -- Implements a buffer with size 3, for state streams
   -- that have triggering "BY SOME" condition (TBS)
   generic
      type ELEMENT_TYPE is private;
      INITIAL_VALUE: ELEMENT_TYPE;
   package TBS_STATE_BUFFER is
      task BUFFER is
         pragma PRIORITY(BUFFER_PRIORITY);
         entry CHECK(NEW_DATA: out BOOLEAN);
         entry READ(VALUE: out ELEMENT_TYPE);
         entry WRITE(VALUE: in ELEMENT_TYPE);
      end BUFFER;
      function NEW_DATA return BOOLEAN;
   end TBS_STATE_BUFFER;

   -- Implements a buffer with size 3, for dataflow
   -- streams, that is, those that have the triggering
   -- "BY ALL" condition
   generic
      type ELEMENT_TYPE is private;
   package FIFO_BUFFER is
      task BUFFER is
         pragma PRIORITY(BUFFER_PRIORITY);
         entry CHECK(NEW_DATA: out BOOLEAN);
         entry WRITE(VALUE: in ELEMENT_TYPE);
         entry READ(VALUE: out ELEMENT_TYPE);
      end BUFFER;
      function NEW_DATA return BOOLEAN;
   end FIFO_BUFFER;

   -- Implements a buffer with size 1, for networked
   -- streams, no matter what kind of streams they are
   with A_STRINGS; use A_STRINGS;
   with ADA_STREAMS;
   with SYSTEM_RPC;
   generic
      type ELEMENT_TYPE is private;
      PROC: SYSTEM_RPC.PARTITION_ID;
      STREAM_NAME: in A_STRING;
   package NETWORK_BUFFER is
      task BUFFER is
         pragma PRIORITY(BUFFER_PRIORITY);
         entry WRITE_EXTERNAL(
            VALUE: in ELEMENT_TYPE;
            PROC: in SYSTEM_RPC.PARTITION_ID;
            STREAM_NAME: in A_STRING);
      end BUFFER;
   end NETWORK_BUFFER;
end PSDL_STREAMS;


Figure 5.7. The New PSDL_STREAMS Ada Package Specification


2) The new drivers package should contain only the driver procedures related to
the operators belonging to that partition. It is very important to notice that the
distributed scheduling model assumes that a stream resides, i.e., it is
instantiated, in the same processor or partition as its consumer operator.*
Therefore, for the consumer operator, it is irrelevant where the data came from,
and, furthermore, no changes will be needed for the individual driver
procedures within this package, since all the reads will be to local streams. The
only change would occur if it was necessary to perform a write to an external
operator. In this case, the write operation should be hard-coded by the
translator as a call to write_external, an entry of the special network stream
task. In Figure 5.8, which presents the network stream task body, it is apparent
that, after this rendezvous is accepted, there should be a call to some
inter-processor communication routine, e.g., DO_APC, that would deliver the
message. It is also at this point where most of the problems are going to
appear, as shall be seen.


with A_STRINGS; use A_STRINGS;
with ADA_STREAMS;
with SYSTEM_RPC;
package body NETWORK_BUFFER is
   task body BUFFER is
      PARAMETERS: SYSTEM_RPC.PARAMS_STREAM_TYPE(3);
      -- This type allows multiple stream elements within the
      -- same stream, depending on its declaration
   begin
      loop
         accept WRITE_EXTERNAL(VALUE: in ELEMENT_TYPE;
                               PROCESSOR: in SYSTEM_RPC.PARTITION_ID;
                               STREAM_NAME: in A_STRING) do
            SYSTEM_RPC.DO_APC(PROCESSOR, PARAMETERS);
            -- parameters will include the remote procedure name,
            -- the psdl_stream_name and value
         end WRITE_EXTERNAL;
      end loop;
   end BUFFER;
end NETWORK_BUFFER;


Figure 5.8. Body of the Network Stream Task 



* This assumption will require that all exceptions from external streams be handled, and
consequently hard-coded, on the consumer's side.




The changes made so far are very minor, since most of the burden is being put on
the write operation to external streams. In fact, the most difficult part of this
implementation is finding a way to receive the incoming messages from the different
processors and operators. Some kind of communications server, which will have the duty of
receiving and routing all the incoming messages to their final destinations, will be needed.
Due to the semantics of PSDL, in order to reliably implement this communication, it will
be necessary to send some kind of header containing the consumer operator, the name of
the stream and the name of the destination processor along with the data.

These requirements for the header come from situations such as when the same
operator is trying to write to the same stream into different operators in different
partitions. This case is illustrated in Figure 5.9. In the next section the different options
available for implementing this communication sub-system are described.
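
The role of the header can be sketched in Python as follows. This is only an illustration of the routing idea, not a proposed CAPS interface: the `Header` fields follow the three items listed above, while `CommunicationsServer`, `register` and `deliver` are hypothetical names, and each local stream is modeled as a plain list.

```python
from collections import namedtuple

# The three pieces of information carried along with the data.
Header = namedtuple("Header", ["processor", "consumer", "stream"])


class CommunicationsServer:
    """Receives incoming messages on one processor and routes them locally."""

    def __init__(self, processor):
        self.processor = processor
        self.local_streams = {}  # (consumer operator, stream name) -> buffer

    def register(self, consumer, stream):
        """Create the local buffer of a stream instantiated at its consumer."""
        self.local_streams[(consumer, stream)] = []

    def deliver(self, header, value):
        """Route one incoming message to the local stream named by its header."""
        if header.processor != self.processor:
            raise ValueError("message addressed to another processor")
        key = (header.consumer, header.stream)
        if key not in self.local_streams:
            raise KeyError("no such stream on this processor: %r" % (key,))
        self.local_streams[key].append(value)
```

In the situation of Figure 5.9, one producer writes the same stream, say `track`, to two consumers on two processors. The stream name alone is ambiguous, whereas `Header("P1", "display", "track")` and `Header("P2", "radar", "track")` identify two distinct local buffers.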



Figure 5.9. Justification for the Header Information

C. IMPLEMENTATION ISSUES OF THE COMMUNICATION SUBSYSTEM 

One of the most important design issues is the choice of the communication
subsystem. It is recommended to use the remote procedure call (RPC) paradigm as
opposed to the traditional message passing mechanism. The reasons for this choice are that
RPC is widely implemented for interprocess communication between computers across a
network, being supported by most of the emerging distributed operating systems. Several
standards have been initiated by organizations such as ISO and OSF. This method also
provides an asynchronous form, relaxing the original synchronous semantics of RPC.
Finally, Annex E (Distributed Systems) of the Ada95 Reference Manual makes it the
choice, though not mandatory, for future implementations of this Annex. [Ada95]

1. The RPC Model 

The remote procedure call model is similar to the local procedure call model. In 
the local case, the caller places arguments to a procedure in some well-specified location. 
It then transfers control to the procedure, and eventually gains back control. At that 
point, the results of the procedure are extracted from the well-specified location and the 
caller continues execution.[Sun90] 

The remote procedure call is similar. That is, the caller process sends a call
message to the server process and waits (blocks) for a reply message. The call message
contains the procedure's parameters, among other things. The reply message contains the
procedure's results, among other things. Once the reply message is received, the results of
the procedure are extracted, and the caller's execution is resumed. [Sun90]

Note that in this model, only one of the two processes is active at any given time.
The RPC protocol, however, makes no restriction if the implementation allows the calling
routine to do some useful work while waiting for the reply (asynchronous mode).

2. The First Approach 

The first idea was to implement the RPC paradigm by using the standard RPC
libraries. However, in order to do that within CAPS, it would be necessary to call from
inside an Ada task, more specifically from inside the network tasks, a C routine that would
implement the RPC calls (see Figure 5.8). The reason for a C routine is that there is no
library support or existing bindings for implementing RPC from inside Ada83. It would
not be difficult to write an Ada wrapper to the C routine. However, the biggest problem
to be dealt with is how to pass the Ada parameters to the C routine, which could be very
complicated abstract data types from the PSDL prototype. Assuming that this problem
could somehow be solved, there is an additional problem: How could this C routine pass
the complex ADTs through the streams? In the Unix/C world, there currently exists a
great deal of support for these kinds of operations.

For example, the rpcgen utility is basically a compiler that accepts a remote
program interface definition written in the RPC language, which is very similar to C, and
outputs a C program, containing all the client routines, the server routine, and most
importantly, all the XDR filter routines. An XDR routine converts procedure arguments
and results into the network format (sequential streams) and vice-versa.

The External Data Representation (XDR) standard comprises a set of library 
routines that allow a C programmer to describe arbitrary data structures in a machine- 
independent fashion. XDR is the backbone of Sun's RPC package, in the sense that data 
for remote procedure calls is transmitted using this standard. It was designed to work 
across different languages, operating systems, and machine architectures. 

It is important to note, however, that most of the time required to prepare a data
structure for transfer is not spent in conversion but in traversing the elements of the data
structure. To transmit a tree, for example, each leaf must be visited and each element in a
leaf record must be copied to a buffer and aligned there. Storage for the leaf may have to
be deallocated after the data is sent. Similarly, to receive a tree, storage must be allocated
for each leaf, data must be moved from the buffer to the leaf and properly aligned, and
pointers must be constructed to link the leaves together. [Sun90]
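
The traversal cost described above can be seen in a small Python sketch that flattens a binary tree into a sequential byte stream and rebuilds it. It only imitates the spirit of XDR (4-byte big-endian units, with a presence flag before each optional node) using the standard `struct` module; it does not call the XDR library routines, and `Node`, `marshal` and `unmarshal` are hypothetical names.

```python
import struct


class Node:
    """Binary tree node holding a 32-bit signed integer."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def marshal(root):
    """Visit every node in preorder and copy its fields into one byte string."""
    chunks = []

    def visit(node):
        if node is None:
            chunks.append(struct.pack(">I", 0))           # flag: no node here
            return
        chunks.append(struct.pack(">Ii", 1, node.value))  # flag, then the value
        visit(node.left)
        visit(node.right)

    visit(root)
    return b"".join(chunks)


def unmarshal(buffer):
    """Allocate the nodes again and relink them while reading the stream."""

    def build(offset):
        (present,) = struct.unpack_from(">I", buffer, offset)
        offset += 4
        if not present:
            return None, offset
        (value,) = struct.unpack_from(">i", buffer, offset)
        offset += 4
        left, offset = build(offset)
        right, offset = build(offset)
        return Node(value, left, right), offset

    root, _ = build(0)
    return root
```

Both directions touch every node exactly once, so the work grows with the size of the structure regardless of how cheap the per-field conversion is. A three-node tree, for instance, occupies 40 bytes on the wire in this encoding: 8 bytes per node plus 4 bytes for each of the four empty child slots.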

In this case what is needed is a remote procedure called receive, running in all the
machines, ready to intercept any incoming messages, and another routine, namely send,
that will also run in all machines and will remotely call the receive routine. In Figure 5.10
both routines, which were successfully tested in the C environment, are presented. Note
that the send routine is not sending anything, but merely passing parameters to the remote
procedure receive.

RPC_REC.C

/* receive.c - remote procedures; called by server stub. */

#include <stdio.h>
/* Standard RPC include file */
#include <rpc/rpc.h>
/* this file is generated by rpcgen */
#include "RPC_receive.h"

/* Receive a string of chars and reply with a status */
char **
receive_1(message)
char **message;
{
    static char status[20] = "OK";
    static char ptr[100];
    static char *ptr1;

    printf("Received message = %s\n", *message);
    fflush(stdout);
    ptr1 = &status[0];
    /* strcpy(ptr, *message);
       ptr1 = &ptr[0]; */
    return (&ptr1);
}

RPC_SEND.C

/* RPC_send.c - client program for remote receive service. */

#include <string.h>
#include <stdio.h>
/* standard RPC include file */
#include <rpc/rpc.h>
/* this file is generated by rpcgen */
#include "RPC_receive.h"

main(argc, argv)
int argc;
char *argv[];
{
    CLIENT *cl;             /* RPC handle */
    char *receiver_name;
    char **status;
    char *message;

    if (argc != 3) {
        fprintf(stderr, "usage: %s hostname message\n", argv[0]);
        exit(1);
    }
    receiver_name = argv[1];
    message = argv[2];

    /* Create the client "handle" */
    if ((cl = clnt_create(receiver_name, DISTR_SCHEDULE, CAPS95, "udp"))
        == NULL) {
        /* Can't establish connection with receiver */
        clnt_pcreateerror(receiver_name);
        exit(2);
    }

    /* call the remote procedure "receive_1" */
    printf("Message to be transmitted = %s\n", message);
    fflush(stdout);
    if ((status = receive_1(&message, cl)) == NULL) {
        clnt_perror(cl, receiver_name);
        exit(3);
    }
    printf("Status from remote receiver %s is %s\n", receiver_name, *status);
    clnt_destroy(cl);       /* done with the handle */
    exit(0);
}

Figure 5.10. The RPC Programs for the New Scheduler 


Finally, if both problems have been solved, i.e., the parameter passing between C
and Ada in the sender side and the Ada bindings for the XDR routines, there is still an
additional problem in the receiver side due to the way RPC is now implemented in C. The
receiving, or the server, routine is implemented as a forever loop by calling the RPC
library routine svc_run(). To overcome this problem one would need to be able to call an
Ada procedure from inside a C routine, and again the same problem of passing parameters
would be present.

Another approach, such as using files to exchange data between C and Ada, could 
be used, but then other problems, such as file locking, and internal synchronization 
between C and Ada tasking (so that no data could be overwritten before being consumed) 
would come into play. 

Because of all these problems, it seems that a better solution is needed, and just
such a solution is present in the Ada95 implementation, which will be described next.

3. The Ada95 Approach 

Annex E defines facilities for supporting the implementation of distributed systems
using multiple partitions working cooperatively as part of a single Ada program. These
facilities include pragmas for categorizing library units according to the role they play in
the distributed system, such as Shared_Passive, Remote_Types and
Remote_Call_Interface, and other mechanisms for supporting communication and access
to shared data. [Ada95]

The Partition Communication Subsystem (PCS), as defined in Annex E, provides 
facilities for supporting communication between the active partitions of a distributed 
program by using the remote procedure call interface (RPC). The aimex also proposes a 
specification for the RPC interface between active partitions within the PCS, which will be 
contained in the package System.RPC. Figure 5.11 introduces the proposed specification 
for the package System.RPC. 

with Ada.Streams; 
package System.RPC is 

   type Partition_ID is range 0 .. implementation-defined; 
   Communication_Error : exception; 

   type Params_Stream_Type (Initial_Size : Ada.Streams.Stream_Element_Count) is new 
      Ada.Streams.Root_Stream_Type with private; 

   procedure Read (Stream : in out Params_Stream_Type; 
                   Item   : out Ada.Streams.Stream_Element_Array; 
                   Last   : out Ada.Streams.Stream_Element_Offset); 

   procedure Write (Stream : in out Params_Stream_Type; 
                    Item   : in Ada.Streams.Stream_Element_Array); 

   -- Synchronous call 
   procedure Do_RPC (Partition : in Partition_ID; 
                     Params    : access Params_Stream_Type; 
                     Result    : access Params_Stream_Type); 

   -- Asynchronous call 
   procedure Do_APC (Partition : in Partition_ID; 
                     Params    : access Params_Stream_Type); 

   -- The handler for incoming RPCs 
   type RPC_Receiver is access procedure (Params : access Params_Stream_Type; 
                                          Result : access Params_Stream_Type); 

   procedure Establish_RPC_Receiver (Receiver : in RPC_Receiver); 

private 
   -- not specified by the language 
end System.RPC; 

Figure 5.11. Package System.RPC (Specification) 

As noted in Figure 5.11, during the execution of a remote subprogram call, most 
of the parameters (and later results, if any) are passed using a stream-oriented 
representation which is suitable for transmission between partitions. The annex calls this 
action marshalling. Unmarshalling is the reverse action of reconstructing the parameters 
or results from the stream-oriented representation. Note that no standard is defined 
for this transformation, but the XDR standard nevertheless seems to be the choice of 
most Ada compiler vendors. 

The type Partition_ID is used to identify a partition, and Params_Stream_Type is 
used for identifying the particular remote subprogram that is being called, as well as 
marshalling and unmarshalling the parameters or result of a remote subprogram call, as 
part of sending them between partitions. The Read and Write procedures override the 
corresponding abstract operations for the type Params_Stream_Type. 

Both synchronous and asynchronous communication are supported, and are 
implemented by the procedures Do_RPC and Do_APC, respectively. Both procedures 
send a message to the active partition identified by the Partition parameter. The first one 
blocks the calling task until a reply message comes from the called partition, or some error 
is detected by the PCS, in which case Communication_Error is raised at the point of the 
call to Do_RPC. Do_APC operates in the same way as Do_RPC, except that it is allowed 
to return immediately after sending the message. 

Finally, the procedure Establish_RPC_Receiver is called only once, immediately 
after elaborating the library units of an active partition, but prior to invoking the main 
subprogram, if any. The Receiver parameter designates an implementation-provided 
procedure called the RPC-receiver which will handle all RPCs received by the partition. 
Establish_RPC_Receiver saves a reference to the RPC-receiver. When a message is 
received at the called partition, the RPC-receiver is called with the Params stream 
containing the message. When the RPC-receiver returns, the contents of the stream 
designated by Result are placed in a message and sent back to the calling partition. 

The implementation of the RPC-receiver shall be reentrant, thereby allowing 
concurrent calls on it from the PCS to service concurrent remote subprogram calls into the 
partition. 

a. The Package Streams 

A Stream is a sequence of elements comprising values from possibly 
different types, and allowing sequential access to these values. A stream type is a type in 
the class whose root type is Streams.Root_Stream_Type. [Ada95] 

The types in this class represent different kinds of streams. The pre-defined 
stream-oriented attributes like T'Read and T'Write make dispatching calls on the Read and 
Write procedures of the Root_Stream_Type. 

package Ada.Streams is 
   pragma Pure (Streams); 

   type Root_Stream_Type is abstract tagged limited private; 

   type Stream_Element is mod implementation-defined; 
   type Stream_Element_Offset is range implementation-defined; 
   subtype Stream_Element_Count is 
      Stream_Element_Offset range 0 .. Stream_Element_Offset'Last; 
   type Stream_Element_Array is 
      array (Stream_Element_Offset range <>) of Stream_Element; 

   procedure Read (Stream : in out Root_Stream_Type; 
                   Item   : out Stream_Element_Array; 
                   Last   : out Stream_Element_Offset) is abstract; 

   procedure Write (Stream : in out Root_Stream_Type; 
                    Item   : in Stream_Element_Array) is abstract; 

private 
   -- not specified by the language 
end Ada.Streams; 

Figure 5.12. Package Ada.Streams (Specification) 

Read operations transfer Item'Length stream elements from the specified 
stream to fill the array Item. The index of the last stream element transferred is returned in 
Last. Last is less than Item'Last only if the end of the stream is reached. 

The Write operation appends Item to the specified stream. There are also 
the Read, Write, Output and Input attributes that convert values to a stream of elements 
and reconstruct values from a stream. 

For every subtype S of a type T, some attributes are defined, which denote 
either a procedure or a function call. Figure 5.13 presents such attributes. 

-- writes the value of Item to Stream 
procedure S'Write (Stream : access Ada.Streams.Root_Stream_Type'Class; 
                   Item   : T); 

-- reads the value of Item from Stream 
procedure S'Read (Stream : access Ada.Streams.Root_Stream_Type'Class; 
                  Item   : out T); 

-- writes the value of Item to Stream, including any bounds or discriminants 
procedure S'Output (Stream : access Ada.Streams.Root_Stream_Type'Class; 
                    Item   : T); 

-- reads and returns the value of Item from Stream, using any bounds or 
-- discriminants written by a corresponding S'Output 
function S'Input (Stream : access Ada.Streams.Root_Stream_Type'Class) 
   return T; 

Figure 5.13. Stream Attributes 

b. Conclusions 

All of the problems that have been discussed in this section have been 
addressed in the Ada95 implementation. Therefore, in order to implement the distributed 
scheduling model, it is only necessary to follow the directions introduced in Section B. It 
is now apparent that the example given in Figure 5.8 had already considered the packages 
(System_RPC and Ada_Streams) and procedures (DO_APC) to be introduced with 
Ada95. The only part that is not yet clear, because it is dependent upon implementation, 
is the marshalling and unmarshalling operations, which will affect the manner in which the 
Ada stream is constructed from the parameters passed during the rendezvous with the 
write_external entry of the network stream task. 

Figure 5.14 presents a pictorial view of the proposed architecture for the 
new Distributed CAPS Scheduler. 

Figure 5.14. Architecture for the Distributed CAPS Scheduler 

D. CPU SPEED RATIO ISSUES IN A PROTOTYPING ENVIRONMENT 

In a software prototyping environment, where the host machines usually used for 
prototyping are not similar to the intended target machines (which may not even be known 
a priori), special attention must be taken so that erroneous conclusions due to timing 
problems during the prototyping are avoided. 

There are two kinds of timing errors that can be foreseen in a real-time system. 
Both of them may cause undesirable system behavior, such as deadlocks, buffer overflows, 
or data inconsistency. The first kind of error has a relative nature, since it is caused by 
computational events that occur in an improper sequence. Such errors are solely dependent on 
the relative order in which the computations occur, and can be avoided by proper 
scheduling of the system [Mok83]. 

The second kind of error is more subtle, in the sense that it is caused by violation 
of some specified timing constraints, such as missing deadlines. In CAPS, since a static 
schedule is used to execute the prototype, this problem can only happen if the MET was 
inaccurately specified, or if the MET was specified for a faster machine. What, 
then, is the real meaning of the MET? Is it an absolute value, or is it dependent upon the 
machine on which the module is running? This question is only the tip of the iceberg, but the 
answer is clear: the MET cannot be absolute, since execution time is a function of 
machine throughput. A module that has an MET of 150 ms on some specific machine 
may take longer than that to execute on a slower machine. 

The problem is even bigger if the CAPS Software Base, which is supposed to be a 
collection of reusable components provided by different vendors, is taken into account. 
Each component should have a PSDL specification, with all the timing constraints, such as 
MET, MRT, MCP, etc. All of this information will be used during the execution phase of 
the prototype, in trying to match needs with the available components. The same problem 
arises regarding their timing reference, since each vendor may well have their own. 

This discussion demonstrates the imperative need for assuming a common timing 
reference within CAPS. It can be anything, as long as it is consistent and used throughout 
the prototype. Care must be taken when choosing this reference, however, since it may 
lead to significant differences when dealing with reusable components from different 
sources. 

1. Choosing a Reference 

Standard measures of performance provide a basis for comparison, and time is the 
best way to measure computer performance. The computer that performs the same 
amount of work in the least time is the fastest. A number of popular measures have been 
adopted in the quest for a standard measure of computer performance, but most of them 
were forced into a service for which they were never intended. [HP90] 

The MIPS (million instructions per second) rating is easily understood by a customer, in 
that faster machines mean higher MIPS ratings. However, the MIPS measure presents the 
following problems: 

1) MIPS is dependent on the instruction set, making it difficult to compare 
machines with different instruction sets 

2) MIPS varies between programs on the same computer 

3) MIPS can vary inversely to performance 

A classic example of the third of these points is the MIPS rating of a machine with 
optional floating-point hardware. A program that uses the hardware floating-point unit takes 
less time to execute, but it also executes fewer and more complex instructions. 
Software floating-point executes more but simpler instructions, resulting in a higher MIPS 
rating [HP90]. 
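
The inverse relationship in the third point can be illustrated with a small calculation. The sketch below computes the MIPS rating of one program run with hardware and with software floating-point; the instruction counts and run times are hypothetical figures chosen only for illustration, not measurements taken from [HP90].

```python
def mips_rating(instruction_count: int, seconds: float) -> float:
    """Return millions of instructions executed per second."""
    return instruction_count / (seconds * 1_000_000)


# Hypothetical figures for the same program on the same machine.
hardware_fp = {"instructions": 10_000_000, "seconds": 1.0}  # fewer, complex instructions
software_fp = {"instructions": 30_000_000, "seconds": 2.0}  # more, simpler instructions

hw_mips = mips_rating(hardware_fp["instructions"], hardware_fp["seconds"])
sw_mips = mips_rating(software_fp["instructions"], software_fp["seconds"])

print(f"hardware FP: {hw_mips:.1f} MIPS, finished in {hardware_fp['seconds']} s")
print(f"software FP: {sw_mips:.1f} MIPS, finished in {software_fp['seconds']} s")
```

With these assumed numbers the software floating-point run earns the higher MIPS rating even though it takes twice as long, which is why MIPS alone is a weak proxy for execution time.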

Another popular alternative is million floating-point operations per second, 
abbreviated as MFLOPS. However, MFLOPS is, clearly, highly dependent on the 
machine and on the program. 

Other options are synthetic benchmarks, such as Whetstone and Dhrystone, but the 
best choice appears to be to use real programs, such as compilers, text editors, CAD tools, 
etc., which have inputs, outputs, and other user-defined options. [HP90] 

While having a standard of performance for computers is still beyond the horizon, 
for prototyping purposes within CAPS, where many of the figures are still subject to 
change during the prototype refinement process, any of these metrics provides a good 
starting place. Again, for the sake of simplicity, the MIPS rating will be the reference 
model for performance in this work. 


2. CAPS Timing Model 

It will be useful to define some of the terms used in construction of the model: 

CAPS Reference - Specifies the MIPS rating of a hypothetical machine, to which 
all of the CAPS timing information should be normalized. 

HOST Reference - Specifies the MIPS rating of the host machine where CAPS is 
installed. This value will be automatically generated by CAPS at the start of the session, 
and it is the result of a Unix system call. 

TARGET Reference - Specifies the MIPS rating of the target machine. In the 
absence of this value, it is assumed that the host machine for CAPS is identical to the 
target machine. This value should be provided by the user at the beginning of the design 
of the prototype, and will affect the retrieval of reusable components from the Software 
Base. 


CPU Speed Ratio - Specifies the MIPS ratio between the target and the host 
machine. It can be changed by the user to make temporary simulations and to overcome 
possible timing errors. This value plays a very important 
role in debugging possible timing errors during prototype execution. Its default value is 
given by the formula: 

\[ \text{CPU Speed Ratio} = \frac{\text{Target Reference}}{\text{Host Reference}} \] 

Table 5.1 specifies the default values which will be used throughout this discussion, 
unless otherwise stated. 

CAPS Reference    Target Reference    Host Reference    CPU Speed Ratio 
10 MIPS           20 MIPS             15 MIPS           1.33 

Table 5.1. Default Values for the Timing Model 
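
As a minimal sketch of these definitions, the fragment below stores the three references and derives the default CPU Speed Ratio. The class and field names are illustrative assumptions, not identifiers taken from CAPS; only the ratio formula and the "no Target Reference means target equals host" rule come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimingModel:
    """MIPS references of the timing model (names are illustrative only)."""
    caps_ref: float                     # hypothetical normalization machine
    host_ref: float                     # machine where CAPS is installed
    target_ref: Optional[float] = None  # None means "same as the host"

    def default_cpu_speed_ratio(self) -> float:
        # With no Target Reference the host is assumed identical to the target.
        target = self.host_ref if self.target_ref is None else self.target_ref
        return target / self.host_ref


table_5_1 = TimingModel(caps_ref=10.0, host_ref=15.0, target_ref=20.0)
print(f"{table_5_1.default_cpu_speed_ratio():.2f}")  # 20 / 15, rounded to two places

no_target = TimingModel(caps_ref=10.0, host_ref=15.0)
print(f"{no_target.default_cpu_speed_ratio():.2f}")  # host doubles as target
```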

a. Building the Prototype 

All timing information, such as MET, PER, FW, MRT, MCP, LAT and 
MOP, specified by the user during the design phase of the prototype, which in most cases 
comes from the Requirements Document, is assumed to be referenced or normalized to 
the Target Reference. Therefore, when, for example, defining an operator with MET = 
100ms, it should be understood that 100ms would be the maximum execution time 
allowed for that operator if running on the target machine. It will default to the host 
machine if the Target Reference is not given. 

Note that the MET of this operator is equivalent to 200ms with respect to 
the CAPS Reference; it is this value of 200ms that will be used in the query to the 
Software Base during the search for a matching reusable component. Observe also that 
this value will affect neither Translation nor Scheduling, since all timing information is 
consistently and linearly normalized to the CAPS Reference. 
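
The normalization step can be sketched as follows. The function name is hypothetical, and the linear scaling with the MIPS rating is the simplifying assumption adopted in this section rather than a general property of real machines.

```python
def normalize_met(met_ms: float, from_mips: float, to_mips: float) -> float:
    """Re-express an MET stated for a from_mips machine on a to_mips machine.

    Assumes execution time scales inversely with the MIPS rating.
    """
    return met_ms * from_mips / to_mips


CAPS_REF = 10.0    # MIPS, from Table 5.1
TARGET_REF = 20.0  # MIPS, from Table 5.1

met_caps = normalize_met(100.0, from_mips=TARGET_REF, to_mips=CAPS_REF)
print(f"MET with respect to the CAPS Reference: {met_caps:.0f} ms")

# Going back from the CAPS Reference to the target recovers the original value.
print(f"MET with respect to the Target Reference: "
      f"{normalize_met(met_caps, CAPS_REF, TARGET_REF):.0f} ms")
```

The same helper applies to any pair of references, since only the ratio of the two MIPS ratings enters the conversion.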

b. Installing Components in the Software Base 

When getting reusable components from a specific vendor or supplier, the 
timing reference used to classify their components should be specified along with the 
component. For example, when a component arrives, it should be labeled as follows: 
component X has a certified MET of 100ms under a 5 MIPS machine. 

This information will allow the insertion of the component into the 
Software Base as a component with MET equal to 50ms, which is the correct value 
normalized with respect to the CAPS Reference. Note that this value will be used during 
its retrieval from the software base by the prototypes. 

3. Relations between CPU Speed Ratio and Timing Errors 


Assuming that all timing information from the reusable components is correct with 
respect to the supplier’s reference, then there should be no timing errors, if the component 
matches the prototype specification. For example: 

Suppose that a component with an MET of 120ms is needed. Then the correct 
query to be performed on the Software Base should be for a component with an MET of 
240ms, i.e., 

\[ MET_{CAPS} = MET_{Target} \times \frac{TARGET_{REF}}{CAPS_{REF}} = 120\,\text{ms} \times \frac{20}{10} = 240\,\text{ms} \] 

Therefore, using this component in the prototype, according to the generated static 
schedule, should not cause any timing errors. However, if it does cause a timing error, 
then it is possible to conclude that the component timing information was incorrect. To 
solve this problem, the following steps can be taken: 

a) Increase the CPU Speed Ratio until the error disappears. This means that a 
reasonable MET for that component with respect to the Target reference, although it may 
not be the tightest one, is equal to: 

\[ \text{New } MET_{Target} = \frac{\text{New CPU Speed Ratio}}{\text{Old CPU Speed Ratio}} \times \text{Original } MET_{Target} \] 

Note that another side effect of performing step a) is that the entire schedule is 
stretched, and, consequently, the slack time available for the dynamic scheduler is 
increased, since some of the timing critical operators do not need more time to execute. 

b) Update the Software Base with the correct timing information for that 
component. 

c) Reset the CPU Speed Ratio to its original value and take either step d), e) or f) 
to solve the problem. 

d) If requirements permit, change the PSDL specification to allow the bigger MET 
found in step a). This in turn will require a whole new CAPS session, starting from a new 
translation until the final compilation. Note that increasing the MET affects the load 
factor and may cause an infeasible schedule. 

e) Search the Software Base for another reusable component that matches the 
original MET. This new one may well have the correct information. 

f) Create another new component or optimize the existing component. Validate its 
timing constraints and update the Software Base if successful. 

g) If it is realized that a faster target processor is needed in order to cope with the 
requirements, then the Target Reference should be changed so that those timing errors 
disappear. Note that this change will only affect the CPU Speed Ratio and, as explained 
earlier, will not change the schedule. Theoretically, the necessary change for the 
Target Reference can be derived very easily from the following formula: 

New Target Reference = New CPU Speed Ratio x Original Host Reference 
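
The two formulas used in steps a) and g) can be sketched together. The function names are hypothetical, and the new ratio of 1.6 is an assumed value standing in for whatever ratio makes the timing error disappear in practice.

```python
def adjusted_met(original_met_ms: float, old_ratio: float, new_ratio: float) -> float:
    """Step a): MET implied by the CPU Speed Ratio at which the error vanished."""
    return original_met_ms * new_ratio / old_ratio


def required_target_reference(new_ratio: float, host_ref_mips: float) -> float:
    """Step g): Target Reference that yields new_ratio on the given host."""
    return new_ratio * host_ref_mips


HOST_REF = 15.0               # MIPS, from Table 5.1
OLD_RATIO = 20.0 / HOST_REF   # default ratio, about 1.33
NEW_RATIO = 1.6               # hypothetical value found by trial

new_met = adjusted_met(120.0, OLD_RATIO, NEW_RATIO)
new_target = required_target_reference(NEW_RATIO, HOST_REF)
print(f"MET {new_met:.1f} ms, Target Reference {new_target:.1f} MIPS")
```

Under these assumptions a component specified at 120ms would need roughly 144ms on the original target, or alternatively a target rated at about 24 MIPS.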

The other source of timing errors is found when dealing with user-created 
components, that is, when a newly created component takes more time than 
specified. For example, assume the previous situation, where a component with MET of 
120ms is required. Since the host machine is slower than the target machine, the 
scheduling time will be linearly stretched by a factor of 1.33, that is, 1.33 x 120ms, or 
159.6ms, will be allowed for the execution interval of this component. If timing errors 
occur, the following steps can be taken to eliminate them: 

a) Increase the CPU Speed Ratio until the error disappears. This means that a 
reasonable MET for that component with respect to the Target reference, although it may 
not be the tightest one, is equal to: 

\[ \text{New } MET_{Target} = \frac{\text{New CPU Speed Ratio}}{\text{Old CPU Speed Ratio}} \times \text{Original } MET_{Target} \] 

b) Reset the CPU Speed Ratio to its original value and take either step c) or d) to 
solve the problem. 

c) If requirements permit, change the PSDL specification to allow the bigger 
MET found in step a). This in turn will require a whole new CAPS session, starting from 
a new translation until the final compilation. Again, this change may cause an infeasible 
schedule. 

d) Rewrite the component, trying to speed it up. 

e) If it is realized that a faster target processor is needed in order to cope with the 
requirements, then the Target Reference should be changed so that those timing errors 
disappear. The required change for the Target Reference can be derived from the 
following formula: 

New Target Reference = New CPU Speed Ratio x Original Host Reference 

f) After getting rid of the timing errors, if it is decided to add the user-created 
component to the software base, the component should be associated with an MET 
normalized to the CAPS Reference: 

\[ MET_{CAPS} = MET_{Target} \times \frac{TARGET_{REF}}{CAPS_{REF}} \] 

4. How the CPU Speed Ratio affects Scheduling 

The Static Schedule is basically a sequence of pairs of absolute values containing 
the start time and stop time for each instance of the time-critical operators within one 
harmonic block. 

At the beginning, the static scheduler task calls the function TARGET_TO_HOST, 
which belongs to the package PSDL_TIMERS, and multiplies all those absolute time 
values by the CPU Speed Ratio. The net effect is that the scheduler will stretch or shrink 
all of the timing information related to the prototype in a linear fashion. 

Figure 5.15. Effect of the CPU Speed Ratio on the Schedule 
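
The linear stretching can be sketched in a few lines. The Python function below is only a hypothetical stand-in for the Ada function TARGET_TO_HOST, whose actual implementation is not reproduced here, and the schedule entries are made-up values.

```python
from typing import List, Tuple

# (operator name, start time in ms, stop time in ms); names are illustrative.
Slot = Tuple[str, float, float]


def target_to_host(schedule: List[Slot], cpu_speed_ratio: float) -> List[Slot]:
    """Scale every absolute start and stop time by the CPU Speed Ratio."""
    return [(name, start * cpu_speed_ratio, stop * cpu_speed_ratio)
            for name, start, stop in schedule]


static_schedule = [("op_a", 0.0, 120.0), ("op_b", 120.0, 200.0), ("op_a", 300.0, 420.0)]

for name, start, stop in target_to_host(static_schedule, 1.33):
    print(f"{name}: {start:.1f} ms to {stop:.1f} ms")
```

Because every boundary is multiplied by the same factor, the order of the operators and the relative proportions of their intervals are preserved; a ratio below one would shrink the schedule instead.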

5. Handling Unwanted Interactions during Prototype Scheduling 

A software prototyping environment needs to simulate external entities so that the 
entire system being prototyped can be exercised. These external entities will in most cases 
either generate inputs or consume outputs from the core of the system being prototyped. 
This requires that the timing constraints are taken into consideration during the generation 
of the schedule. However, it is during prototype execution that the effects are most 
harmful, since they will incorrectly steal CPU time from the host system. It is also 
unavoidable that time is spent by the host operating system to serve processes that 
sometimes have nothing to do with the prototype. 

All these unwanted interactions can dramatically affect timing behavior and overall 
confidence in the prototype. The question to ask, then, is how can these timing 
interferences be eliminated? 

To solve these problems, CAPS introduced the technique of having two different 
time lines. One is the absolute time line, which is driven by the real-time clock of the host 
machine. The second one, the simulation time, will command all the scheduling actions of 
the prototype. 

Whenever an external operator, or some operating system function, is being 
executed, the scheduling clock is frozen, so that, for the 
prototype, it is as though they do not exist. 

Another feature that can be explored with this technique is when an operator 
belonging to the prototype exceeds its scheduling interval and causes an exception. It is 
very likely that this will interfere with other operators, causing a chain of exceptions, when 
in reality, only the very first operator incurred a timing error. Because of the use of a 
simulated clock (the scheduling clock) it is possible to remove any excess of time from the 
scheduling clock, and then resume the simulation, so that no further operators will be 
affected. 
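
A minimal sketch of the two time lines follows. The class and method names are assumptions made for illustration, and host time is supplied as explicit elapsed values rather than read from a real clock so that the behavior is deterministic; it is not the CAPS implementation.

```python
class SchedulingClock:
    """Simulation time line that can be frozen while host time keeps running."""

    def __init__(self) -> None:
        self._sim_time_ms = 0.0
        self._frozen = False

    def advance(self, host_elapsed_ms: float) -> None:
        """Account for elapsed host time; simulation time moves only if not frozen."""
        if not self._frozen:
            self._sim_time_ms += host_elapsed_ms

    def freeze(self) -> None:
        self._frozen = True

    def resume(self) -> None:
        self._frozen = False

    def remove_excess(self, overrun_ms: float) -> None:
        """Discard the time by which an operator overran its scheduling interval."""
        self._sim_time_ms -= overrun_ms

    @property
    def now(self) -> float:
        return self._sim_time_ms


clock = SchedulingClock()
clock.advance(50.0)        # a prototype operator runs for 50 ms
clock.freeze()
clock.advance(30.0)        # an external operator runs; simulation time stands still
clock.resume()
clock.advance(130.0)       # an operator with a 100 ms interval overruns by 30 ms
clock.remove_excess(30.0)  # later operators see the clock as if no overrun occurred
print(clock.now)
```

In this sketch the host time line advanced by 210 ms while the scheduling clock ends at 150 ms, which is the time the prototype would have seen without the external activity and the overrun.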

VI. EXPERIMENTAL RESULTS 


A. INTRODUCTION 

Although the full implementation of the new Distributed Model is not complete, 
due to software limitations of the current Ada compiler technology that will be solved by 
the new Ada95 implementation, much can be said about expectations and also about the 
general scheduling capability of CAPS. 

One of the biggest problems encountered during this research was the lack of an 
adequate set of prototypes to test the schedule. Up to now, most of the development in 
CAPS has been tested with a few prototypes that may be sufficient for the development of 
several tools, but not for the scheduler, which requires a huge test set so that all the 
critical points can be exercised. This is the reason for building a PSDL random graph 
generator, as discussed in the next section of this chapter. 

B. THE RANDOM GRAPH GENERATOR 

The random graph generator has the following basic features: 

1) builds PSDL prototypes with an arbitrary number of operators 

2) allows the user to specify how many different prototypes are to be generated 

3) provides an expert mode where the system attempts to reduce the harmonic 
block automatically, by changing the periods of the periodic and the 
transformed sporadic operators within a user-defined range 

4) operates in two randomization modes: unlimited or restricted randomization 

5) provides a compression capability, so that an arbitrary number of operators may 
be located within a bounded load factor of one. This is very useful for testing 
uniprocessor scheduling algorithms 

6) allows the user to specify the desired percentage of timing critical operators, 
periodic operators, and data flow edges 

7) can generate prototypes with different degrees of sparseness 

8) allows the user to specify the maximum number of edges between two operators 

9) provides thorough scheduling information for debugging purposes 

There are basically two major procedures that build the random graph. The first 
one is Produce_Random_Array and the second one is Produce_Random_Matrix. 
Both routines use the same data structure as the scheduler, so that the simulation is as 
close as possible to the real prototype. 


[Figure: Ada record declarations for the random graph, holding per-operator fields such as 
the operator ID, MRT, MCP, period (with actual, lower and upper values), latency, load 
factor, fan-in, fan-out and connectivity.] 

Figure 6.1. Partial View of the Data Structure Used to Build the Random Graph. 


The first procedure, Produce_Random_Array, is the one that actually randomly 
assigns the timing constraints to the random prototype. It has two modes of operation. 
The first one uses a partial randomization, in the sense that only values from a pre-defined 
set are assigned to the timing constraints. The second mode uses a full randomization, so 
that any value within a finite range previously specified can be assigned. 

It is in this procedure that most of the information provided by the user, such as 
number of prototypes to be generated, number of operators in each prototype, percentage 
of timing critical operators, mode of randomization, percentage of periodic operators, and 
compression factor, is used. 

In the current implementation, the restricted randomization mode generates five 
possible different values for MET (100, 300, 500, 700, and 1000) and four values for each 
of the remaining timing constraints PER, FW, MCP and MRT, which are dependent upon 
the previously chosen value for the MET. This was done in order to assure semantic 
compatibility with a valid PSDL prototype. 

If one opts for unlimited randomization, then no restriction is imposed on timing 
constraints, other than limiting their values to a reasonable range, which now stands 
between 0 and 8000 ms. 

The random number generator being used has a period of approximately 2*^, so in 
order to achieve better results it is not reset after the generation of each different 
prototype. 

The expert mode is a facility that allows the user to automatically reduce the final 
harmonic block length of the prototype, substantially increasing the schedulability of the 
prototype. For more in-depth information, refer to Chapter III, Section E. 

The compression factor is used so that, if the prototype happens to have a load 
factor bigger than one (which would mean that it couldn’t run on a uniprocessor system) 
then the timing constraints are compressed accordingly. This feature allows us 
to test huge prototypes for uniprocessors that otherwise, due to the random nature of the 
graph, would be very hard to achieve. 
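
One plausible reading of this compression step is sketched below. It assumes that the load factor is the sum of MET/period over all operators and that compression scales every MET by the same factor; both are assumptions made for illustration, since the generator's actual rule (which might, for instance, stretch periods instead) is not spelled out here.

```python
from typing import List


def load_factor(mets: List[float], periods: List[float]) -> float:
    """Assumed definition: sum of MET / period over all operators."""
    return sum(met / period for met, period in zip(mets, periods))


def compress(mets: List[float], periods: List[float], max_load: float = 1.0) -> List[float]:
    """Scale all METs uniformly so the load factor does not exceed max_load."""
    load = load_factor(mets, periods)
    if load <= max_load:
        return list(mets)          # already fits on one processor
    scale = max_load / load
    return [met * scale for met in mets]


mets = [300.0, 500.0, 700.0]       # hypothetical randomly drawn METs (ms)
periods = [500.0, 1000.0, 1000.0]  # hypothetical periods (ms)

compressed = compress(mets, periods)
print(f"load before: {load_factor(mets, periods):.2f}")
print(f"load after:  {load_factor(compressed, periods):.2f}")
```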

The second main procedure, Produce_Random_Matrix, is where artificial edges 
are randomly generated according to the degree of sparseness and the maximum number 
of edges defined by the user. It is also here that the latency for each edge is generated. 

C. FIRST FINDINGS AFTER USING THE RANDOM GRAPH GENERATOR 

The first finding after using the random graph generator was that the scheduling 
capability of the existing CAPS scheduler is very poor. It is not likely that the scheduler 
will find a feasible schedule for a moderate size prototype without manual adjustment of 
all timing constraints after a long and tedious process of trial and error. That alone is not 
surprising since, after all, the static scheduling problem is a well-known NP-hard 
problem. The interesting thing, however, is that even for very small prototypes, with as 
few as 4 or 5 operators, and also a very limited number of edges, it still couldn’t find a 
feasible schedule, even through the use of traditional and widely accepted algorithms, such 
as earliest start time first and earliest deadline first, modified for the non-preemptive 
case. The question to be asked is: why does that happen, and how can we improve it? 

After meticulous analysis of several runs, with hundreds of random prototypes, it 
was determined that, on average, the earliest deadline first algorithm finds a feasible 
schedule for prototypes with load factors less than 0.5. It was also noticeable that the 
schedulability of the prototype was affected somehow by the harmonic block length (HB). 
There were some cases where, even with load factors over 0.95, after optimizing the HB 
to smaller values, it was possible to find a feasible schedule, which could not be achieved 
with the bigger HB. The load factor definitely has a strong influence on schedulability. 
For the harmonic block, however, it was not thought that the influence would be so great. 
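
To make the discussion concrete, the sketch below implements a textbook non-preemptive earliest deadline first pass over one harmonic block. It is not the CAPS scheduler: it assumes integer times, deadlines equal to periods, all first releases at time zero, no data flow edges, and no inserted idle time, and all names are illustrative.

```python
from functools import reduce
from math import gcd
from typing import List, Optional, Tuple

Task = Tuple[str, int, int]  # (name, MET, period); deadline assumed equal to period


def lcm(a: int, b: int) -> int:
    return a * b // gcd(a, b)


def edf_non_preemptive(tasks: List[Task]) -> Optional[List[Tuple[str, int, int]]]:
    """Return (name, start, stop) triples for one harmonic block, or None on a miss."""
    block = reduce(lcm, (period for _, _, period in tasks))
    # One entry per instance: (deadline, release, name, met)
    pending = sorted(
        (release + period, release, name, met)
        for name, met, period in tasks
        for release in range(0, block, period)
    )
    schedule = []
    now = 0
    while pending:
        ready = [job for job in pending if job[1] <= now]
        if not ready:
            now = min(job[1] for job in pending)  # idle until the next release
            continue
        job = min(ready)                          # earliest deadline among ready jobs
        deadline, _, name, met = job
        if now + met > deadline:
            return None                           # the instance would finish too late
        schedule.append((name, now, now + met))
        now += met                                # runs to completion, no preemption
        pending.remove(job)
    return schedule


print(edf_non_preemptive([("a", 1, 4), ("b", 2, 6)]))
print(edf_non_preemptive([("short", 1, 2), ("long", 3, 10)]))
```

The second task set has a load factor of only 0.8, yet the long operator blocks an instance of the short one past its deadline, so the function returns None. This is the kind of non-preemptive blocking that can defeat the algorithm well below a load factor of one.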

There are two readily apparent explanations for the harmonic block syndrome. 
The first is that, because of the higher number of instances that can fit in a bigger HB, the 
probability of having two or more tasks fighting for the same time slot increases. The 
second explanation is partially supported by Theorem 6 in Chapter III, where it is evident 
that increasing the period of an operator, which might happen when its period is 
optimized, also increases the probability of finding a feasible schedule. 

The following problems are now apparent: first, how to decrease the load factor 
of our prototype; and second, how to decrease its associated harmonic block. 

The total load factor of the prototype cannot be changed much, since it comes 
from the user’s requirements. Splitting the operators among multiple processors will not do much 
good in the current practice for non-preemptive static distributed scheduling, which 
requires a global schedule for the entire prototype in order to satisfy all synchronization 
requirements. 

In order to change the harmonic block, assuming that the METs cannot be 
changed, it is necessary to modify the periods, but recall that they are constrained by the 
user’s requirements. However, if we take a close look at these problems it is possible to 
realize that they are quite different. 

Assume that the requirements allow for making small changes in the periods, which 
is a fair assumption, since in most systems it does not really matter if the period of 
some task is 1000 ms or 1010 ms. The effect of such a period change on the load factor 
is clearly very small, while for the harmonic block it may represent a very big change, since 
it may get rid of some prime factor that was driving up the least common multiple (LCM) 
of the periods. Following this line of reasoning a novel technique to decrease the 
harmonic block was discovered, and will be described in the next section. 
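
The asymmetry between the two effects can be checked numerically. In the sketch below the task set is hypothetical, the harmonic block is taken to be the LCM of the periods, and the load factor is assumed to be the sum of MET/period.

```python
from functools import reduce
from math import gcd
from typing import List


def harmonic_block(periods: List[int]) -> int:
    """LCM of the periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)


def load_factor(mets: List[int], periods: List[int]) -> float:
    """Assumed definition: sum of MET / period over all operators."""
    return sum(met / period for met, period in zip(mets, periods))


mets = [100, 50, 25]        # hypothetical METs in ms
before = [1010, 500, 250]   # 1010 = 2 x 5 x 101 drags the prime 101 into the LCM
after = [1000, 500, 250]    # a 1% change in one period removes that prime

for label, periods in (("before", before), ("after", after)):
    print(f"{label}: harmonic block = {harmonic_block(periods)} ms, "
          f"load factor = {load_factor(mets, periods):.4f}")
```

With these assumed values the load factor moves only in the third decimal place, while the harmonic block shrinks from 50500 ms to 1000 ms.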

D. MINIMIZING THE HARMONIC BLOCK 

The need for a harmonic block comes from the fact that, unlike most of the 
problems in classical scheduling, a periodic task set contains an infinite number of 
task instances. Therefore, in order to calculate a static schedule for the task set, it is 
necessary to find a time interval whose schedule can be repeated forever. When the 
completion times of the first instances are restricted to be less than or equal to the 
periods, it is common to use the least common multiple (LCM) of the periods as the 
harmonic block, that is, as the length of such an interval. However, when those 
restrictions on the deadlines do not apply, it has been proven in Chapter III, Section C, 
that it is sufficient to increase the time interval to twice the LCM. At any rate, the point 
to be made is that in both cases the size of the LCM is critical and, for the reasons 
explained in the previous section, it is desirable to make it as small as possible. 

Formally, the least common multiple of two natural numbers i and j is the smallest 
natural number that is divisible by both i and j. It is also known from Number Theory that 
every integer greater than 1 can be written uniquely, up to the order of the factors, as a 
product of powers of distinct primes. 

From the above definitions, it can be seen that the prime factorization of the LCM of 
two natural numbers i and j contains every prime factor of the original numbers, each 
raised to the maximum exponent with which it appears in i or j, as shown in the 
following example. 

Example: 

$i = 120 = 2^3 \times 3 \times 5$ 

$j = 100 = 2^2 \times 5^2$ 

$\mathrm{LCM}(i, j) = 2^3 \times 3 \times 5^2 = 600$ 

This same approach can be extrapolated to a case where several numbers are 
present, instead of only two. So now the problem is decreasing the LCM of a set of 
periods. 
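
The maximum-exponent rule can be checked mechanically. The following Python sketch is an illustration only: the function names are hypothetical, and trial division is used purely for simplicity, which is adequate for period-sized integers.

```python
from collections import Counter


def prime_factors(n):
    """Return a mapping {prime: exponent} for an integer n >= 1 (trial division)."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        factors[n] += 1  # whatever remains is itself prime
    return factors


def lcm_from_factors(numbers):
    """LCM built as the product of each prime raised to its maximum exponent."""
    max_exp = Counter()
    for n in numbers:
        for p, e in prime_factors(n).items():
            max_exp[p] = max(max_exp[p], e)
    result = 1
    for p, e in max_exp.items():
        result *= p ** e
    return result, dict(max_exp)


value, exponents = lcm_from_factors([120, 100])
print(value, exponents)
```

For the example above this yields 600 with exponents 3, 1, and 2 for the primes 2, 3, and 5, and the same function accepts any number of periods.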

There are two basic approaches. The first one is trying to decrease the factor with 
the biggest prime, and the other is decreasing the biggest prime factor. Clearly, the second 
approach is more expedient, but it still leaves the following problem. Suppose all of the 
periods which are contributing to the factors in the LCM are identified, and have been 
placed into a critical list, with some kind of mapping to the factors they are affecting. 
Now, assume that the period which is contributing the biggest factor is changed. With 
luck, that biggest factor may be eliminated. However, the exponent of some other prime 
factor from that same period may be increased, now becoming the critical one for the 
LCM. In other words, it is necessary to re-evaluate the critical list and the corresponding 
mapping after each iteration of the optimization process, or one may end up with a 
non-optimal solution. 

After this brief description of the problem statement, it is possible to introduce the 
algorithm for optimizing the LCM, which is presented in Figure 6.2. 

Algorithm Optimize_LCM 

    For every period calculate its prime factors; 
    Calculate the initial LCM for the periodic task set and its prime factors; 
    Set the flag LCM is decreasing to false; 

    While there exists a prime factor of the LCM not yet optimized loop 

        Insert those tasks whose periods are contributing to the LCM factors into the 
        Critical List in decreasing order of their contribution. In other words, the head 
        of the Critical List will be the task with the biggest contribution to the LCM; 

        While the Critical List is non-empty loop 
            Pick the task which is at the head of the Critical List; 
            Remove its contribution from the LCM; 
            For each period within its allowable range loop 
                Calculate the new LCM; 
                If LCM is decreasing then record this period as the best one so far; 
            end loop; 
            If LCM is decreasing then 
                Update the new LCM and the task's prime factors; 
            end if; 
            Remove this task from the Critical List; 
        end loop; 

        If LCM is decreasing then 
            It means that some critical task in the Critical List had its period changed 
            and consequently reduced the LCM. Now comes the subtle part: even if 
            some period in the Critical List could not have its biggest factor changed 
            so that the LCM would decrease, it needs to be reconsidered, since the 
            order in which the Critical List was scanned matters. In other words, 
            after all the others in the Critical List have been processed, it may well now 
            be possible to change that same task so that the LCM will be decreased. So, 
            we need to calculate the new LCM and start all over again; 
        else (LCM is not decreasing) 
            It means that none of the critical tasks in the Critical List were able to get 
            rid of their biggest factor, and so there is nothing else to do other than skip 
            to the second biggest factor, and so forth; 
        end if; 

        Set LCM is decreasing flag back to false; 
    end loop; 

end Algorithm Optimize_LCM; 

Figure 6.2. Algorithm for Optimizing the LCM 
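
The general idea of Figure 6.2 can be sketched in Python as follows. This is a simplified greedy variant written for illustration, not a transcription of the figure: it does not keep the per-prime bookkeeping, it measures a task's contribution as the ratio between the current LCM and the LCM of the remaining periods, and it models the allowable range as a symmetric absolute tolerance around the original period. The function name and the `tolerance` parameter are assumptions. `math.lcm` with several arguments requires Python 3.9 or later.

```python
from math import lcm


def optimize_lcm(periods, tolerance):
    """Greedy sketch: adjust one period at a time while the LCM keeps decreasing.

    periods   -- list of positive integer periods
    tolerance -- assumed maximum absolute change allowed for each period
    Returns (new_periods, new_lcm).
    """
    original = list(periods)
    current = list(periods)
    best_lcm = lcm(*current)
    lcm_is_decreasing = True
    while lcm_is_decreasing:          # start over whenever some period changed
        lcm_is_decreasing = False

        def contribution(k):
            # How much smaller the LCM would be without task k.
            others = current[:k] + current[k + 1:]
            return best_lcm // lcm(*others)

        # Critical list: task indices, biggest contribution first.
        critical = sorted(range(len(current)), key=contribution, reverse=True)
        for k in critical:
            others = current[:k] + current[k + 1:]
            base = lcm(*others)       # LCM with task k's contribution removed
            low = max(1, original[k] - tolerance)
            high = original[k] + tolerance
            for candidate in range(low, high + 1):
                new_lcm = lcm(base, candidate)
                if new_lcm < best_lcm:   # record the best period so far
                    best_lcm = new_lcm
                    current[k] = candidate
                    lcm_is_decreasing = True
    return current, best_lcm


print(optimize_lcm([1010, 400, 250], tolerance=10))
```

The outer loop mirrors the restart described in the figure: once any period has changed, every task is reconsidered, because a task that could not help earlier may be able to help after the others have moved. The loop terminates because the LCM is a positive integer that strictly decreases on every change.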

Although its optimality has not been formally proven, it is believed that this 
algorithm will always lead to near-optimal results. By applying this algorithm to some 
random task sets it was possible to reduce the harmonic block tremendously, with some 
positive effects on schedulability. It should also be noted, from the examples shown below, 
that the choice of periods is of critical importance. With very small changes in the periods 
an enormous decrease in the LCM can be achieved, with correspondingly small effects on 
the load factor. 




[Table of optimization results with a NEW PERIOD column; the remaining entries are not recoverable.] 

Figure 6.3. Optimization Results 

E. THE NEW DISTRIBUTED SCHEDULING ALGORITHM - SOME RESULTS 

After running several hundred prototypes with typical values for the timing 
constraints (such as MET, MRT, MCP and PER) it was possible to draw several 
conclusions in addition to those already cited in the previous sections. One of them, and 
actually the main driving force for directing us to distributed scheduling, was the palpable 
necessity for prototypes with load factors bigger than 1.0, specifically in our application 
domain. 

Another major point discovered in this research is the real need for supporting 
and advising the real-time system designer, mainly with respect to the values for the timing 
constraints. Remember that non-preemptive static scheduling is a well-known NP-Hard 
problem, so that unless P=NP, there is not much hope of finding better ways to solve this 
problem. That is why, sometimes, even in prototypes with only two nodes, it was impossible 
to find a feasible schedule. 

So, what is really needed is to find better ways to live with this problem. One of 
the ways to accomplish this is by providing better support in the area of schedulability 
tests, which is also a known NP-Hard problem. That is why several theorems were 
presented in Chapter III, which, it is hoped, will help in finding and pinpointing some of 
the problems in the user’s design. 

By making use of those theorems it is possible to suggest changes in the timing 
constraints of a set of tasks, or even of a specific task, and to suggest different partitions 
so that some tasks are kept together due to the similarities of their timing constraints. 

Now the scheduler can handle prototypes with load factors bigger than one, by 
applying the allocation algorithm described in Chapter V. The user can specify either the 
maximum load factor allowed per processor or the number of processors. The scheduler is 
also capable of generating a schedule, if one can be found, by using a distributed version 
of the Earliest Deadline First algorithm. By making use of the Fundamental Synchronization 
Theorem it is now possible to divide the schedule into several smaller schedules, so that its 
complexity is tremendously decreased. 

The robustness of the new scheduler is enhanced by the extensive testing that was 
made possible by the random graph generator. Several important bugs were found during 
these experiments. It was possible to analyze and compare the performance of the 
different uniprocessor scheduling algorithms currently implemented in CAPS. The output 
generated by the scheduler is now more comprehensive, improving the debugging 
capability. 

An expert mode is provided to the designer, so that the harmonic block can be 
decreased at the cost of some effect on the load factor. A possible enhancement for the 
expert mode is to combine it with the actual scheduling. In other words, instead of applying 
the optimization algorithm to the entire task set in only one step, prior to the scheduling, an 
attempt should be made to schedule the task set after each optimization iteration. 

As can be seen, quite a lot has been accomplished towards a more dependable and 
reliable scheduler, but much more needs to be done so that CAPS can become a true 
design aid to real-time system designers. 

VII. CONCLUSIONS AND RECOMMENDATIONS 


A. SUMMARY OF THE DISSERTATION 

This dissertation can be roughly divided into three parts. The first part (Chapters I 
through III) presents a review of the most important results in the area of hard real-time 
scheduling and introduces several theorems to improve the schedulability analysis of task 
sets containing both periodic and sporadic tasks. The effects of precedence relationships 
among the tasks on these theorems are also analyzed. Although most of the work was done 
for the non-preemptive model, several results are also applicable to the preemptive case, 
as highlighted throughout the dissertation. The second part of the dissertation (Chapter 
IV) introduces the novel method of hard real-time distributed scheduling without explicit 
synchronization. The motivation for this new approach is the complexity of the hard 
real-time scheduling problem, where even for small systems running in a uni-processor 
environment, it is extremely hard to find a feasible schedule. With the addition of one 
more variable, such as distributed processing, the general scheduling problem becomes 
intractable, and unless P=NP, there is no reason to foresee any solution to this problem. It 
was therefore decided to sacrifice timing constraints in order to decrease the complexity of 
the scheduling problem. Depending on the application, this approach may not be 
applicable. However, this approach should work in most cases, especially in prototyping, 
which usually takes place in the early stages of the life cycle of the system, allowing for 
the fine tuning of timing requirements. The third part of the dissertation deals with the 
architectural aspects of implementing a distributed real-time scheduler without 
making use of any explicit synchronization. The following paragraphs present a summary 
of the salient results found in each chapter. 

Chapter I highlights the increasing demand for real-time systems in life-critical 
areas that were heretofore unexplored. Some basic definitions for hard real-time systems 
are also introduced, and a taxonomy for scheduling is proposed. Past research in real-time 
scheduling is reviewed and the major results are listed in tabular form. A brief note shows 
that scheduling problems which are solved in polynomial time for a non-periodic task set 
become exponential when dealing with periodic task sets. Some 
complexity results for message routing in hard real-time distributed systems are also 
presented. 

Chapter II presents a brief discussion of the Computer Aided Prototyping System 
(CAPS), which is a software engineering tool for developing prototypes of real-time 
systems. The Prototyping System Description Language (PSDL) and its facilities for 
modeling real-time systems are also described in this chapter. 

Chapter III formalizes the real-time scheduling problem for periodic and sporadic 
task sets. It starts by introducing the scheduling model that will be used throughout the 
dissertation, and proceeds with the presentation of several theorems for improving the 
schedulability analysis of tasks with hard deadlines. The three most important results in 
this chapter are established by Theorems 6, 7, and 8. The Task Demand Theorem 
(Theorem 6) specifies necessary conditions for task sets with arbitrary deadlines and 
release times to be schedulable. It is also shown that if release times are taken into 
consideration, due to precedence relations, for example, the conditions are no longer 
necessary, but only sufficient. Theorem 7 extends this result and proves that any periodic 
or sporadic task set satisfying the conditions of Theorem 6 can be scheduled with the 
Earliest Deadline First (EDF) algorithm, thus making the conditions specified in Theorem 
6 necessary and sufficient. The Harmonic Block Theorem (Theorem 8) introduces the 
novel concept of transient and cyclic schedules, which is an enhancement of the traditional 
method for calculating a cyclic schedule, if one exists. It is shown by example that this 
new method improves the schedulability of task sets which were found to have no 
feasible schedule by the traditional method. Later in the chapter all previous results are 
reanalyzed for the case where precedence relationships exist among the tasks. Theorem 8 is 
also extended to handle the situation where latencies are involved in the scheduling. Note 
that the net effect of introducing latencies in the problem is that the schedule can no longer 
be assumed to have no inserted idle time in the interval [0, LCM]. Finally, a 
methodology to convert sporadic operators into equivalent periodic ones is presented, 
along with some important considerations about this conversion. 

Chapter IV presents an in-depth discussion covering all possible aspects of the 
communication involving two PSDL operators connected by some kind of data stream. 
The synchronization problem between producers and consumers is carefully analyzed, as is 
the underlying meaning of missing a deadline within the context of a real-time system. 
The conclusion reached is that a missed deadline is always attached to data that is not 
generated or consumed at the proper time. This data-oriented approach to the synchronization 
problem leads to the new distributed scheduling model with no explicit 
synchronization, which is formalized by the Fundamental Synchronization Theorem 
(Theorem 9). The application of this theorem allows each set of tasks allocated to a 
particular processor to be treated as a totally independent set, provided that some more 
stringent timing constraints are satisfied. This approach will greatly decrease the 
scheduling complexity of large distributed real-time systems, although it may be applicable 
as well to cases involving uni-processors or shared memory multiprocessors. The chapter 
ends with some considerations about the allocation model implemented for the 
distributed scheduler in CAPS. 

Chapter V presents the current implementation of the CAPS uni-processor 
scheduler and proposes an architecture for implementing the full version of the 
distributed scheduler. It describes two options for implementing the distributed version. 
The first is to use the currently available C libraries for implementing the communication 
sub-system. Several problems with this approach are also addressed. The second option 
relies on the availability of a full Ada95 compiler, which, according to the Ada95 
Reference Manual’s Annex E, will support communications between tasks running on 
different processors. In the last section of this chapter several interesting considerations 
are presented regarding the timing problems involved in a typical software prototyping 
environment. Topics such as simulated time, normalized reference for time information, 
timing errors, and why they happen are covered in this section. 

Chapter VI presents experimental results of the partially implemented distributed 
scheduler in CAPS. The random PSDL graph generator, which was one of the important 
factors for a better understanding of the scheduling problems in CAPS, is described. 
Finally, an important issue is discussed which is not given enough attention by most 
researchers, namely, the least common multiple (LCM) of the periods of a periodic task 
set, which ultimately will determine the size of the cyclic schedule for the task set. It is 
demonstrated that, by making minor changes in the original periods, the final LCM and, 
consequently, the solution space of the corresponding scheduling problem can be 
drastically reduced. 

Chapter VII is the conclusion, but it also proposes some modifications for CAPS, 
so that it can become a more dependable and reliable design tool for building real-time 
systems. 

B. POSSIBLE CAPS MODIFICATIONS 

As a result of this dissertation, several weaknesses and areas requiring 
improvement within the entire CAPS and PSDL were identified. Many errors in the static 
scheduler were corrected, but others require further effort. 

1. Enhancing the CAPS Syntax Directed Editor (SDE) 

As discussed in Chapter IV, several semantic checks for the input PSDL program 
are currently enforced by the scheduler. It seems reasonable, however, to allow most of 
these checks to be enforced by the SDE. This approach would allow the user to detect 
problems and receive warnings about the design in the early stages of prototyping. In doing 
so, the designer would not have to go all the way back to the SDE when a semantic error 
was found by the scheduler. 

2. Tasks with Soft Deadlines 

In CAPS there are only tasks with hard deadlines (TC), or tasks with no deadlines 
at all (NTC). In real-time systems, however, there is often a third kind of deadline which, 
if it is missed for some reason, does not cause any harm to the system. This is known as a 
“soft deadline”. Right now, for example, an NTC operator can starve for a long time 
before its execution. This was certainly not the intention of the designer when the 
operator was placed in the prototype. This anomaly happens because the Non-Time 
Critical (NTC) operator depends on the time left over by the static scheduler, which can be 
none if the load factor is 1.0 and all the TC operators use their entire MET. 

The implementation of tasks with soft deadlines or some other approach, like the 
time-value functions presented in Chapter I, would greatly improve the scheduling 
capability of prototypes in CAPS. 

3. Preemptive Static Scheduling 

So far this option has not been used in CAPS because of the Ada83 tasking 
model, which prevents tasks with higher priority from changing their relative position in the 
FIFO queue of a rendezvous. Ada95, however, allows dynamic changes in the queue 
according to task priority and, therefore, the preemptive model again becomes a valid and 
reasonable option for the CAPS scheduler. Note that, in general, the preemptive 
scheduling problem is easier to deal with than the non-preemptive one, allowing much 
better scheduling results. Further research is needed, but it appears that allowing a 
mixture of preemptive and non-preemptive tasks is the best approach available. 

4. Triggering Conditions versus Stream Types 

Currently, in the PSDL model a sampled stream does not guarantee that the data is 
not lost or replicated. In the same model, however, the stream type is determined from 
the triggering condition of the consumer operator, e.g., an operator with a TRIGGERED 
BY SOME condition is supposed to guarantee that its output is based on the most recent 
value of the input sampled stream, which is to some extent a contradiction. Our 
suggestion is to separate triggering conditions from the type of the streams, so that there 
can be a more orthogonal grammar for PSDL. A sampled stream should be defined as a 
stream where the data can be read zero or more times, whereas in a data flow stream it can 
be read once and only once. It is understood that this definition better conveys the real 
meaning of a stream, since a stream by itself should not guarantee whether or not the data 
is lost; the stream is simply a mechanism to transfer data. 

Once the idea of separating triggering conditions from stream types is accepted, it 
is necessary to check which combinations are valid. These combinations are presented 
in Table 7.1, and should be considered valid unless an exception is noted. 




                     TRIGGERED BY ALL   TRIGGERED BY SOME   NO TRIGGER 
DATA FLOW STREAM     OK                 (2)                 (3) 
SAMPLED STREAM       (1)                OK                  OK 

Table 7.1. Triggering Condition and Stream Type Combinations 


(1) Assume an operator A TRIGGERED BY ALL X,Y, where X and Y are 
sampled streams. Suppose data arrived only in X. It is necessary to wait for new data in 
Y, but after A is fired both pieces of data are consumed, and the old data cannot be used 
again; otherwise it is impossible to know which data is new or old. Therefore the 
existence of this case does not make sense. The only situation where this combination 
would be needed is if combinations of TRIGGERED BY SOME and TRIGGERED BY 
ALL are allowed to exist for the same operator. Note, however, that this combination can 
always be implemented in two steps and with one additional operator. 

(2) Assume an operator A TRIGGERED BY SOME X,Y, where X and Y are 
data flow streams. Suppose only X gets new data. Operator A will fire and consume the 
data in X, leaving nothing behind because it is data flow. When new data comes in Y, 
there is nothing in X, and an underflow will occur. 

(3) This combination does not make sense, because if there is no trigger, how can the 
consumer be guaranteed to always catch new data that comes into the data flow? 
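
A semantic check of this kind could be expressed as a simple lookup. The Python sketch below encodes only the three exceptions described in notes (1) through (3); the identifiers (`data_flow`, `by_all`, and so on) are illustrative names chosen here, not PSDL keywords, and the function is not part of CAPS.

```python
# Exceptions taken from notes (1)-(3); every other pair is treated as valid.
EXCEPTIONS = {
    ("sampled", "by_all"): "note (1): old and new data cannot be told apart",
    ("data_flow", "by_some"): "note (2): firing on one input can underflow the other",
    ("data_flow", "none"): "note (3): without a trigger new data may be missed",
}
STREAMS = ("data_flow", "sampled")
TRIGGERS = ("by_all", "by_some", "none")


def check_combination(stream, trigger):
    """Return (is_valid, reason) for one stream type / triggering condition pair."""
    if stream not in STREAMS or trigger not in TRIGGERS:
        raise ValueError(f"unknown combination: {stream!r}, {trigger!r}")
    reason = EXCEPTIONS.get((stream, trigger))
    return (reason is None, reason or "OK")


for stream in STREAMS:
    for trigger in TRIGGERS:
        print(stream, trigger, check_combination(stream, trigger))
```

A table-driven check keeps the rule set in one place, so an editor or scheduler front end could issue a warning with the matching note whenever an exceptional pair appears in a design.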

5. Estimating the Execution Time 

As explained earlier, the MET is an upper bound on the execution time of an 
operator, and it is this value which is used by the scheduler to generate the static schedule. 
Therefore, everything that can be done to decrease the MET is going to have a direct 
effect on the schedulability of the prototype. It would be useful to keep track, at run-time, 
of the real amount of time needed by each operator, so that feedback 
could be given to the user about its real MET for further update of the Software Base. 
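
One possible shape for such run-time bookkeeping is sketched below in Python. Everything in it is hypothetical: the class name, the `run` and `report` methods, and the use of a wall-clock timer are assumptions made for illustration, and a largest observed time is only evidence about the MET, not a guaranteed upper bound.

```python
import time
from collections import defaultdict


class ExecutionTimeTracker:
    """Records the largest observed execution time for each operator name."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock                      # injectable for testing
        self.max_observed = defaultdict(float)   # seconds
        self.calls = defaultdict(int)

    def run(self, name, operator, *args, **kwargs):
        """Execute the operator and update its timing statistics."""
        start = self._clock()
        try:
            return operator(*args, **kwargs)
        finally:
            elapsed = self._clock() - start
            self.calls[name] += 1
            self.max_observed[name] = max(self.max_observed[name], elapsed)

    def report(self, declared_met):
        """Pair each observed maximum with the declared MET (both in seconds)."""
        return {
            name: {
                "declared": declared_met.get(name),
                "observed_max": self.max_observed[name],
                "calls": self.calls[name],
            }
            for name in self.max_observed
        }


tracker = ExecutionTimeTracker()
tracker.run("op_a", sum, range(1000))
print(tracker.report({"op_a": 0.010}))
```

The report could then be shown to the designer, who decides whether the declared MET in the Software Base should be revised.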

6. The Uninitialized Sampled Stream Problem 

Suppose there is a non-time critical operator (NTC) connected to a time critical 
operator (TC) by a sampled stream. Clearly, the TC operator may be fired at least once 
before the NTC operator, and therefore it will read garbage from the sampled stream. 

This problem is aggravated in distributed scheduling, as shown by the example in 
Figure 7.1. 



Figure 7.1. The Uninitialized Sampled Stream Problem 

Note that this example does not cause any problem in the uni-processor case, but 
in distributed scheduling, if OP1 and OP2 are assigned to different processors, OP2 may 
fire before OP1, and an uninitialized sampled stream will be read. A proposed solution 
would be to force the sampled stream to be declared as a state stream whenever an initial 
value is needed. 

7. State Stream versus Data Flow 

It does not make sense to have an operator TRIGGERED BY ALL X if X is, for 
example, a state stream. The reason for this is that values carried by state streams should 
always be available, whereas in a data flow stream the value is consumed after it is read 
and is no longer available. A warning should therefore be given if this happens in a PSDL 
program. 

C. CONCLUSIONS 


This dissertation shows that hard real-time systems and, more specifically, hard 
real-time scheduling, are areas which are far from being totally explored. The next 
generation of hard real-time systems will be extremely large, complex, and most certainly 
distributed. They will be truly distributed, without any need for synchronization among 
processors. 

Most of the work so far in this area has been concentrated on finding better 
scheduling algorithms, without concentrating on the real need for synchronization. 
Missed deadlines are always attached to data not being generated or consumed in a timely 
fashion. This dissertation is the first work ever done in the area of distributed scheduling 
without any explicit synchronization, and it is hoped that it will mark a turning point in the 
distributed scheduling field. It is far from being complete, but it does provide a totally 
different perspective on the distributed scheduling problem. 

Finally, this dissertation offers the following scientific contributions: 

1) A new model for distributed scheduling without synchronization; 

2) Several theorems on the schedulability of periodic and sporadic task sets, 
improving the state of the art in the scheduling field; 

3) A general Timing Model for Prototyping Systems, which will enable interaction 
with different time references, keeping total consistency throughout the design; 

4) A method for optimizing the schedule length of periodic task sets. This 
approach will decrease the time spent in scheduling and improve the chances of 
finding a feasible schedule; 

5) An adaptation of recent theoretical results in scheduling to the model used 
in this work, in order to support a systematic and formal method 
for the design, synthesis, and validation of timing constraints in hard real-time 
systems. 

More specifically related to CAPS, the following contributions can be listed as 
additional results of this dissertation: 

1) Enhancement of the existing CAPS Prototyping System with a new Distributed 
Scheduler with: 

• allocation capability 

• increased reliability 

• better schedulability 

• and an expert mode 

2) A Random PSDL Graph Generator. 